Wikitech
labswiki
https://wikitech.wikimedia.org/wiki/Main_Page
MediaWiki 1.46.0-wmf.22
0
3240
2398863
2398861
2026-04-05T20:08:46Z
Wooze
38071
Restored revision 2355113 by [[Special:Contributions/Plutus|Plutus]] ([[User talk:Plutus|talk]]) (TG)
2398863
wikitext
text/x-wiki
{{Please leave this line alone (sandbox heading)}}
phlij3i0lq7l17sacctmpzowd8epftu
Deployments
0
4108
2398868
2398838
2026-04-06T07:37:00Z
ScheduleDeploymentBot
37566
Add [[gerrit:1264631]] to Monday, April 06 UTC morning backport window
2398868
wikitext
text/x-wiki
{{Navigation MediaWiki deployment}}
This page tracks '''upcoming''' '''deployments''' of software to the [[:m:Special:SiteMatrix|Wikimedia Foundation servers]].
== Getting started ==
Ensure you have joined the {{irc|wikimedia-operations}} IRC channel, as all deployment-related communication happens there.
If you need help, contact [[:mw:Wikimedia Release Engineering Team|Release Engineering]] on IRC at {{irc|wikimedia-releng}} and ping Tyler (<code>thcipriani</code>).
* '''MediaWiki is deployed weekly''' through the [[/Train|Deployment Train]]. Other services follow their own schedule.
* '''Times are pinned to San Francisco''', so the corresponding UTC times shift in March and November with [[:en:Daylight saving time in the United States|DST]].
* '''Prefer regular [[Backport windows]]''' over adding new windows. To request deployment of a config change or backport, add your username and Gerrit URL to one of the backport windows on this page. You must be online in #wikimedia-operations on IRC during your deployment and install [[WikimediaDebug]] ahead of time. The #wikimedia-operations channel requires you to [[:m:IRC/Instructions#Register your nickname, identify, and enforce|register your nickname]] before you can join.
** You can use the '''backport scheduling tool''' to more easily edit this page: <div style="text-align: center; margin: 1em 0">{{Clickable button 2|:toollabs:schedule-deployment|Schedule a backport|class=mw-ui-progressive}}</div>
* Tasks that meet the [[/Inclusion criteria|inclusion criteria]] '''require their own windows'''; this includes long-running tasks. '''Schedule more time''' than you think you need to account for delays and setbacks. We recommend one hour for most tasks.
** To create or modify a recurring deploy window, send a patchset to the [[:gitlab:repos/releng/release/-/blob/main/make-deployment-calendar/deployments-calendar.yaml|deployments-calendar.yaml file]] in <code>repos/releng/release.git</code>.
** To create a one-off window, edit this page accordingly.
** '''Announce''' changes to the [[mail:ops|ops mailing list]] ahead of time if you anticipate or are uncertain about noticeable impacts to database load, HTTP caching, or the introduction of new cookies.
** '''Announce''' deployments of major features to the community via [[:m:Tech/News/Next|Tech News]] and/or via other [[:mw:Wikimedia_Product_Guidance/Communication_channels|Product communication channels]].
* '''Something went wrong?''' See [[Incident response]]. Is there a user-impacting problem? Communicate in the {{irc|wikimedia-operations}} IRC channel. If there is a Phabricator task, ensure [[:phab:tag/wikimedia-incident/|#Wikimedia-Incident]] is tagged, and consider setting the [[:mw:Phabricator/Project_management#Priority_levels|Unbreak Now]] priority.
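Because window times are pinned to San Francisco, the UTC time of a given window depends on the date. The following is a minimal sketch of that conversion, not part of any deployment tooling. It assumes the <code>when</code> strings used on this page follow the form <code>YYYY-MM-DD HH:MM SF</code>, that "SF" means the IANA zone <code>America/Los_Angeles</code>, and that time zone data is available on the system. The function name <code>sf_to_utc</code> is hypothetical.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Assumption: "SF" on this page corresponds to the America/Los_Angeles zone.
SF = ZoneInfo("America/Los_Angeles")


def sf_to_utc(when: str) -> datetime:
    """Convert a 'YYYY-MM-DD HH:MM SF' string to a timezone-aware UTC datetime."""
    stamp = when.removesuffix(" SF").strip()
    local = datetime.strptime(stamp, "%Y-%m-%d %H:%M").replace(tzinfo=SF)
    return local.astimezone(timezone.utc)


if __name__ == "__main__":
    # Same wall-clock time in San Francisco, different UTC hour depending on DST.
    print(sf_to_utc("2026-04-06 13:00 SF"))  # daylight time (UTC-7): 20:00 UTC
    print(sf_to_utc("2026-01-05 13:00 SF"))  # standard time (UTC-8): 21:00 UTC
```

The same 13:00 window therefore lands one hour later in UTC during the winter months, which is why the page treats the San Francisco time as the fixed reference.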
__TOC__
{{anchor|Next Week|Near Term|Near term|Near-term}}{{clear}}
[[Category:Deployment]]
{{Note|content=Subscribe in Google Calendar via <code>wikimedia.org_rudis09ii2mm5fk4hgdjeh1u64@group.calendar.google.com</code>.<br>This may not include one-off windows. '''If there are differences, then the wiki page is canonical and correct'''.}}
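A backport request in the windows below consists of the requester's IRC nickname followed by a <code>deploy</code> template line. The sketch below builds those two lines in the shape used by existing entries on this page. It is an illustration only: the helper name <code>backport_entry</code> is hypothetical, only the template parameters visible in existing entries are used, and <code>type=config</code> is the only value confirmed by this page. The sketch does not escape pipe characters in the title.

```python
def backport_entry(nick: str, gerrit_id: int, title: str,
                   task: str = "", change_type: str = "config") -> str:
    """Return the wikitext lines for one backport request.

    Mirrors the shape of existing entries on this page; other values for
    change_type are an assumption and are not confirmed by the page.
    """
    requester = "{{ircnick|" + nick + "|" + nick + "}}"
    deploy = ("{{deploy|type=" + change_type
              + "|gerrit=" + str(gerrit_id)
              + "|title=" + title
              + "|status=}}")
    if task:
        # Append the Phabricator task reference when one is given.
        deploy += " - {{phabricator|" + task + "}}"
    return requester + "\n" + deploy


if __name__ == "__main__":
    print(backport_entry(
        "katherine_g",
        1264631,
        "Set live configuration for Extension:PersonalDashboard on English Wikipedia",
        task="T421415",
    ))
```

The output goes above the placeholder <code>Requesting Developer</code> line in the <code>what</code> field of the chosen window. The backport scheduling tool linked under Getting started makes this edit for you.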
==Week of April 06==
==={{Deployment_day|date=2026-04-05}}===
{{Deployment calendar event card
|when=2026-04-05 00:00 SF
|length=24
|window=No deploys all day! See [[Deployments/Emergencies]] if things are broken.
|who=
|what=No Deploys
}}
==={{Deployment_day|date=2026-04-06}}===
{{Deployment calendar event card
|when=2026-04-06 00:00 SF
|length=1
|window=[[Backport windows|UTC morning backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Amir1|Amir}}, {{ircnick|Urbanecm|Martin}}, {{ircnick|awight|Adam}}
|what={{ircnick|katherine_g|katherine_g}}
{{deploy|type=config|gerrit=1264631|title=Set live configuration for Extension:PersonalDashboard on English Wikipedia|status=}} - {{phabricator|T421415}}
{{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-04-06 03:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC mid-day)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-04-06 06:00 SF
|length=1
|window=[[Backport windows|UTC afternoon backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Lucas_WMDE|Lucas}}, {{ircnick|Urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-04-06 07:30 SF
|length=0.5
|window=Test Kitchen Experiment Deployment Window
|who=Test Kitchen
|what=Automatic start/stop of active experiments and instruments managed by [[Test Kitchen]].
}}
{{Deployment calendar event card
|when=2026-04-06 08:30 SF
|length=0.5
|window=Wikimedia Portals Update
|who={{ircnick|jan_drewniak|Jan Drewniak}}
|what=Weekly window for the portals page: https://www.wikipedia.org/
}}
{{Deployment calendar event card
|when=2026-04-06 10:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC late)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-04-06 10:00 SF
|length=0.5
|window=Wikidata Query Service weekly deploy
|who={{ircnick|ryankemper|Ryan}}
|what=...
}}
{{Deployment calendar event card
|when=2026-04-06 13:00 SF
|length=1
|window=[[Backport windows|UTC late backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|RoanKattouw|Roan}}, {{ircnick|Urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}, {{ircnick|kindrobot|Stef}}, {{ircnick|cjming|Clare}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-04-06 14:00 SF
|length=2
|window=Weekly Security deployment window
|who={{ircnick|Reedy|Sam}}, {{ircnick|sbassett|Scott}}, {{ircnick|Maryum|Maryum}}, {{ircnick|manfredi|Manfredi}}
|what=Held deployment window for Security-team related deploys.
}}
{{Deployment calendar event card
|when=2026-04-06 16:00 SF
|length=1
|window=Web Team deployment window
|who=Web Team
|what=NOTE: This window is often skipped. The Web Team does not typically check IRC, so assume the window is unused once it is 5 minutes past the start time.
}}
{{Deployment calendar event card
|when=2026-04-06 19:00 SF
|length=1
|window=Automatic branching of MediaWiki, extensions, skins, and vendor – see [[Heterogeneous deployment/Train deploys]]
|who=N/A
|what=Branch <code>wmf/1.46.0-wmf.23</code>
}}
{{Deployment calendar event card
|when=2026-04-06 20:00 SF
|length=1
|window=Automatic deployment of MediaWiki, extensions, skins, and vendor to testwikis only – see [[Heterogeneous deployment/Train deploys]]
|who=N/A
|what=Deploy <code>wmf/1.46.0-wmf.23</code> to testwikis
}}
{{Deployment calendar event card
|when=2026-04-06 21:00 SF
|length=1
|window=Automatic removal of all obsolete MediaWiki versions from the deployment and bare metal servers (except the most-recent obsolete version)
|who=N/A
|what=Runs <code>scap clean auto</code>
}}
{{Deployment calendar event card
|when=2026-04-06 23:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC early)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-04-06 23:00 SF
|length=0.5
|window=Primary database switchover
|who={{ircnick|marostegui|Manuel Arostegui}}, {{ircnick|Amir1|Amir}}, {{ircnick|federico3|Federico Ceratto}}
|what=Held deployment window for maintenance of primary database masters
}}
==={{Deployment_day|date=2026-04-07}}===
{{Deployment calendar event card
|when=2026-04-07 00:00 SF
|length=1
|window=[[Backport windows|UTC morning backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Amir1|Amir}}, {{ircnick|Urbanecm|Martin}}, {{ircnick|awight|Adam}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-04-07 03:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC mid-day)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-04-07 05:00 SF
|length=1
|window=Mobileapps/RESTBase/Wikifeeds
|who=Content Transform Team
|what=Content transform team node services (mobileapps/wikifeeds)
}}
{{Deployment calendar event card
|when=2026-04-07 06:00 SF
|length=1
|window=[[Backport windows|UTC afternoon backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Lucas_WMDE|Lucas}}, {{ircnick|Urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-04-07 07:00 SF
|length=0.5
|window=Test Kitchen UI Deployment Window
|who=Experimentation Platform Team
|what=Deployment of Test Kitchen UI (fka MPIC)
}}
{{Deployment calendar event card
|when=2026-04-07 07:30 SF
|length=0.5
|window=Test Kitchen Experiment Deployment Window
|who=Test Kitchen
|what=Automatic start/stop of active experiments and instruments managed by [[Test Kitchen]].
}}
{{Deployment calendar event card
|when=2026-04-07 08:00 SF
|length=1
|window=SRE Collaboration Services office hours
|who={{ircnick|jelto|Jelto}}, {{ircnick|arnoldokoth|Arnold}}, {{ircnick|mutante|Daniel}}, {{ircnick|arnaudb|Arnaud}}
|what=Services including Gerrit, Phorge (Phabricator), GitLab
}}
{{Deployment calendar event card
|when=2026-04-07 09:00 SF
|length=1
|window=[[Puppet request window]]<br/><small>'''(Max 6 patches)'''</small>
|who={{ircnick|jhathaway|JHathaway}}, {{ircnick|rzl|Reuven}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to Puppet change''
}}
{{Deployment calendar event card
|when=2026-04-07 10:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC late)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-04-07 11:00 SF
|length=2
|window=MediaWiki train - Utc-7+Utc-0 Version
|who={{ircnick|dancy|Ahmon}}, {{ircnick|jnuche|Jaime}}
|what=[[mw:MediaWiki 1.46/Roadmap#Schedule for the deployments|1.46 schedule]]
{{DeployOneWeekMini|1.46.0-wmf.22->1.46.0-wmf.23|1.46.0-wmf.22|1.46.0-wmf.22}}
* group0 to [[mw:MediaWiki_1.46/wmf.23|1.46.0-wmf.23]]
* '''Blockers: {{phabricator|T420481}}'''
}}
{{Deployment calendar event card
|when=2026-04-07 13:00 SF
|length=1
|window=[[Backport windows|UTC late backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|RoanKattouw|Roan}}, {{ircnick|Urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}, {{ircnick|kindrobot|Stef}}, {{ircnick|cjming|Clare}}
|what={{ircnick|hyang|hyang}}
{{deploy|type=config|gerrit=1264856|title=REST: Publish ReadingLists v0 module in REST Sandbox|status=}} - {{phabricator|T419619}}
{{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-04-07 14:00 SF
|length=1
|window=Web Team deployment window
|who=Web Team
|what=NOTE: This window is often skipped. The Web Team does not typically check IRC, so assume the window is unused once it is 5 minutes past the start time.
}}
{{Deployment calendar event card
|when=2026-04-07 23:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC early)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
==={{Deployment_day|date=2026-04-08}}===
{{Deployment calendar event card
|when=2026-04-08 00:00 SF
|length=1
|window=[[Backport windows|UTC morning backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Amir1|Amir}}, {{ircnick|Urbanecm|Martin}}, {{ircnick|awight|Adam}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-04-08 01:00 SF
|length=2
|window=MediaWiki train - Utc-7+Utc-0 Version (secondary timeslot)
|who={{ircnick|dancy|Ahmon}}, {{ircnick|jnuche|Jaime}}
|what=[[mw:MediaWiki 1.46/Roadmap#Schedule for the deployments|1.46 schedule]]
{{DeployOneWeekMini|1.46.0-wmf.23|1.46.0-wmf.22->1.46.0-wmf.23|1.46.0-wmf.22}}
* group1 to [[mw:MediaWiki_1.46/wmf.23|1.46.0-wmf.23]]
* '''Blockers: {{phabricator|T420481}}'''
}}
{{Deployment calendar event card
|when=2026-04-08 03:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC mid-day)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-04-08 04:00 SF
|length=1
|window=[[mw:Services|Services]] – [[Citoid]] / [[Zotero]]
|who=Marielle ({{ircnick|mvolz}})
|what=See [[mw:Citoid|Citoid]]
}}
{{Deployment calendar event card
|when=2026-04-08 06:00 SF
|length=1
|window=[[Backport windows|UTC afternoon backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Lucas_WMDE|Lucas}}, {{ircnick|Urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-04-08 07:00 SF
|length=1
|window=Wikifunctions Services UTC Afternoon
|who=Abstract Wikipedia team (Africa, Europe, Eastern Americas)
|what=Wikifunctions back-end k8s services
}}
{{Deployment calendar event card
|when=2026-04-08 07:30 SF
|length=0.5
|window=Test Kitchen Experiment Deployment Window
|who=Test Kitchen
|what=Automatic start/stop of active experiments and instruments managed by [[Test Kitchen]].
}}
{{Deployment calendar event card
|when=2026-04-08 10:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC late)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-04-08 11:00 SF
|length=2
|window=MediaWiki train - Utc-7+Utc-0 Version
|who={{ircnick|dancy|Ahmon}}, {{ircnick|jnuche|Jaime}}
|what=[[mw:MediaWiki 1.46/Roadmap#Schedule for the deployments|1.46 schedule]]
{{DeployOneWeekMini|1.46.0-wmf.23|1.46.0-wmf.22->1.46.0-wmf.23|1.46.0-wmf.22}}
* group1 to [[mw:MediaWiki_1.46/wmf.23|1.46.0-wmf.23]]
* '''Blockers: {{phabricator|T420481}}'''
}}
{{Deployment calendar event card
|when=2026-04-08 13:00 SF
|length=1
|window=[[Backport windows|UTC late backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|RoanKattouw|Roan}}, {{ircnick|Urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}, {{ircnick|kindrobot|Stef}}, {{ircnick|cjming|Clare}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-04-08 14:00 SF
|length=1
|window=Wikifunctions Services UTC Late
|who=Abstract Wikipedia team (North and South America)
|what=Wikifunctions back-end k8s services
}}
{{Deployment calendar event card
|when=2026-04-08 15:00 SF
|length=1
|window=Web Team deployment window
|who=Web Team
|what=NOTE: This window is often skipped. The Web Team does not typically check IRC, so assume the window is unused once it is 5 minutes past the start time.
}}
{{Deployment calendar event card
|when=2026-04-08 23:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC early)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-04-08 23:00 SF
|length=0.5
|window=Primary database switchover
|who={{ircnick|marostegui|Manuel Arostegui}}, {{ircnick|Amir1|Amir}}, {{ircnick|federico3|Federico Ceratto}}
|what=Held deployment window for maintenance of primary database masters
}}
==={{Deployment_day|date=2026-04-09}}===
{{Deployment calendar event card
|when=2026-04-09 00:00 SF
|length=1
|window=[[Backport windows|UTC morning backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Amir1|Amir}}, {{ircnick|Urbanecm|Martin}}, {{ircnick|awight|Adam}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-04-09 01:00 SF
|length=2
|window=MediaWiki train - Utc-7+Utc-0 Version (secondary timeslot)
|who={{ircnick|dancy|Ahmon}}, {{ircnick|jnuche|Jaime}}
|what=[[mw:MediaWiki 1.46/Roadmap#Schedule for the deployments|1.46 schedule]]
{{DeployOneWeekMini|1.46.0-wmf.23|1.46.0-wmf.23|1.46.0-wmf.22->1.46.0-wmf.23}}
* group2 to [[mw:MediaWiki_1.46/wmf.23|1.46.0-wmf.23]]
* '''Blockers: {{phabricator|T420481}}'''
}}
{{Deployment calendar event card
|when=2026-04-09 03:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC mid-day)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-04-09 05:00 SF
|length=1
|window=Mobileapps/RESTBase/Wikifeeds
|who=Content Transform Team
|what=Content transform team node services (mobileapps/wikifeeds)
}}
{{Deployment calendar event card
|when=2026-04-09 06:00 SF
|length=1
|window=[[Backport windows|UTC afternoon backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Lucas_WMDE|Lucas}}, {{ircnick|Urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-04-09 07:30 SF
|length=0.5
|window=Test Kitchen Experiment Deployment Window
|who=Test Kitchen
|what=Automatic start/stop of active experiments and instruments managed by [[Test Kitchen]].
}}
{{Deployment calendar event card
|when=2026-04-09 08:00 SF
|length=1
|window=Train log triage
|who={{ircnick|dancy|Ahmon}}, {{ircnick|jnuche|Jaime}}
|what=See [[Heterogeneous deployment/Train deploys#Breakage]]
}}
{{Deployment calendar event card
|when=2026-04-09 09:00 SF
|length=1
|window=[[Puppet request window]]<br/><small>'''(Max 6 patches)'''</small>
|who={{ircnick|jhathaway|JHathaway}}, {{ircnick|rzl|Reuven}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to Puppet change''
}}
{{Deployment calendar event card
|when=2026-04-09 10:00 SF
|length=1
|window=Cloud Services/Technical Documentation weekly deploy (Toolhub, Developer portal, Striker)
|who={{ircnick|bd808}}
|what=...
}}
{{Deployment calendar event card
|when=2026-04-09 10:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC late)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-04-09 11:00 SF
|length=2
|window=MediaWiki train - Utc-7+Utc-0 Version
|who={{ircnick|dancy|Ahmon}}, {{ircnick|jnuche|Jaime}}
|what=[[mw:MediaWiki 1.46/Roadmap#Schedule for the deployments|1.46 schedule]]
{{DeployOneWeekMini|1.46.0-wmf.23|1.46.0-wmf.23|1.46.0-wmf.22->1.46.0-wmf.23}}
* group2 to [[mw:MediaWiki_1.46/wmf.23|1.46.0-wmf.23]]
* '''Blockers: {{phabricator|T420481}}'''
}}
{{Deployment calendar event card
|when=2026-04-09 13:00 SF
|length=1
|window=[[Backport windows|UTC late backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|RoanKattouw|Roan}}, {{ircnick|Urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}, {{ircnick|kindrobot|Stef}}, {{ircnick|cjming|Clare}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-04-09 14:00 SF
|length=1
|window=Web Team deployment window
|who=Web Team
|what=NOTE: This window is often skipped. The Web Team does not typically check IRC, so assume the window is unused once it is 5 minutes past the start time.
}}
{{Deployment calendar event card
|when=2026-04-09 23:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC early)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
==={{Deployment_day|date=2026-04-10}}===
{{Deployment calendar event card
|when=2026-04-10 00:00 SF
|length=24
|window=No deploys all day! See [[Deployments/Emergencies]] if things are broken.
|who=
|what=No Deploys
}}
{{Deployment calendar event card
|when=2026-04-10 04:00 SF
|length=0.5
|window=GitLab version upgrades
|who={{ircnick|jelto|Jelto}}, {{ircnick|arnoldokoth|Arnold}}, {{ircnick|mutante|Daniel}}, {{ircnick|arnaudb|Arnaud}}
|what=GitLab version upgrades
}}
==={{Deployment_day|date=2026-04-11}}===
{{Deployment calendar event card
|when=2026-04-11 00:00 SF
|length=24
|window=No deploys all day! See [[Deployments/Emergencies]] if things are broken.
|who=
|what=No Deploys
}}
==Week of April 13==
==={{Deployment_day|date=2026-04-12}}===
{{Deployment calendar event card
|when=2026-04-12 00:00 SF
|length=24
|window=No deploys all day! See [[Deployments/Emergencies]] if things are broken.
|who=
|what=No Deploys
}}
==={{Deployment_day|date=2026-04-13}}===
{{Deployment calendar event card
|when=2026-04-13 00:00 SF
|length=1
|window=[[Backport windows|UTC morning backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Amir1|Amir}}, {{ircnick|Urbanecm|Martin}}, {{ircnick|awight|Adam}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-04-13 03:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC mid-day)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-04-13 06:00 SF
|length=1
|window=[[Backport windows|UTC afternoon backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Lucas_WMDE|Lucas}}, {{ircnick|Urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-04-13 07:30 SF
|length=0.5
|window=Test Kitchen Experiment Deployment Window
|who=Test Kitchen
|what=Automatic start/stop of active experiments and instruments managed by [[Test Kitchen]].
}}
{{Deployment calendar event card
|when=2026-04-13 08:30 SF
|length=0.5
|window=Wikimedia Portals Update
|who={{ircnick|jan_drewniak|Jan Drewniak}}
|what=Weekly window for the portals page: https://www.wikipedia.org/
}}
{{Deployment calendar event card
|when=2026-04-13 10:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC late)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-04-13 10:00 SF
|length=0.5
|window=Wikidata Query Service weekly deploy
|who={{ircnick|ryankemper|Ryan}}
|what=...
}}
{{Deployment calendar event card
|when=2026-04-13 13:00 SF
|length=1
|window=[[Backport windows|UTC late backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|RoanKattouw|Roan}}, {{ircnick|Urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}, {{ircnick|kindrobot|Stef}}, {{ircnick|cjming|Clare}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-04-13 14:00 SF
|length=2
|window=Weekly Security deployment window
|who={{ircnick|Reedy|Sam}}, {{ircnick|sbassett|Scott}}, {{ircnick|Maryum|Maryum}}, {{ircnick|manfredi|Manfredi}}
|what=Held deployment window for Security-team related deploys.
}}
{{Deployment calendar event card
|when=2026-04-13 16:00 SF
|length=1
|window=Readers deployment window
|who=Readers
|what=NOTE: This window is often skipped. The reader teams do not typically check IRC, so assume the window is unused once it is 5 minutes past the start time.
}}
{{Deployment calendar event card
|when=2026-04-13 19:00 SF
|length=1
|window=Automatic branching of MediaWiki, extensions, skins, and vendor – see [[Heterogeneous deployment/Train deploys]]
|who=N/A
|what=Branch <code>wmf/1.46.0-wmf.24</code>
}}
{{Deployment calendar event card
|when=2026-04-13 20:00 SF
|length=1
|window=Automatic deployment of MediaWiki, extensions, skins, and vendor to testwikis only – see [[Heterogeneous deployment/Train deploys]]
|who=N/A
|what=Deploy <code>wmf/1.46.0-wmf.24</code> to testwikis
}}
{{Deployment calendar event card
|when=2026-04-13 21:00 SF
|length=1
|window=Automatic removal of all obsolete MediaWiki versions from the deployment and bare metal servers (except the most-recent obsolete version)
|who=N/A
|what=Runs <code>scap clean auto</code>
}}
{{Deployment calendar event card
|when=2026-04-13 23:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC early)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-04-13 23:00 SF
|length=0.5
|window=Primary database switchover
|who={{ircnick|marostegui|Manuel Arostegui}}, {{ircnick|Amir1|Amir}}, {{ircnick|federico3|Federico Ceratto}}
|what=Held deployment window for maintenance of primary database masters
}}
==={{Deployment_day|date=2026-04-14}}===
{{Deployment calendar event card
|when=2026-04-14 00:00 SF
|length=1
|window=[[Backport windows|UTC morning backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Amir1|Amir}}, {{ircnick|Urbanecm|Martin}}, {{ircnick|awight|Adam}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-04-14 03:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC mid-day)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-04-14 05:00 SF
|length=1
|window=Mobileapps/RESTBase/Wikifeeds
|who=Content Transform Team
|what=Content transform team node services (mobileapps/wikifeeds)
}}
{{Deployment calendar event card
|when=2026-04-14 06:00 SF
|length=1
|window=[[Backport windows|UTC afternoon backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Lucas_WMDE|Lucas}}, {{ircnick|Urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-04-14 07:00 SF
|length=0.5
|window=Test Kitchen UI Deployment Window
|who=Experimentation Platform Team
|what=Deployment of Test Kitchen UI (fka MPIC)
}}
{{Deployment calendar event card
|when=2026-04-14 07:30 SF
|length=0.5
|window=Test Kitchen Experiment Deployment Window
|who=Test Kitchen
|what=Automatic start/stop of active experiments and instruments managed by [[Test Kitchen]].
}}
{{Deployment calendar event card
|when=2026-04-14 08:00 SF
|length=1
|window=SRE Collaboration Services office hours
|who={{ircnick|jelto|Jelto}}, {{ircnick|arnoldokoth|Arnold}}, {{ircnick|mutante|Daniel}}, {{ircnick|arnaudb|Arnaud}}
|what=Services including Gerrit, Phorge (Phabricator), GitLab
}}
{{Deployment calendar event card
|when=2026-04-14 09:00 SF
|length=1
|window=[[Puppet request window]]<br/><small>'''(Max 6 patches)'''</small>
|who={{ircnick|jhathaway|JHathaway}}, {{ircnick|rzl|Reuven}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to Puppet change''
}}
{{Deployment calendar event card
|when=2026-04-14 10:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC late)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-04-14 11:00 SF
|length=2
|window=MediaWiki train - Utc-7 Version
|who={{ircnick|dduvall|Dan}}, {{ircnick|dancy|Ahmon}}
|what=[[mw:MediaWiki 1.46/Roadmap#Schedule for the deployments|1.46 schedule]]
{{DeployOneWeekMini|1.46.0-wmf.23->1.46.0-wmf.24|1.46.0-wmf.23|1.46.0-wmf.23}}
* group0 to [[mw:MediaWiki_1.46/wmf.24|1.46.0-wmf.24]]
* '''Blockers: {{phabricator|T420482}}'''
}}
{{Deployment calendar event card
|when=2026-04-14 13:00 SF
|length=1
|window=[[Backport windows|UTC late backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|RoanKattouw|Roan}}, {{ircnick|Urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}, {{ircnick|kindrobot|Stef}}, {{ircnick|cjming|Clare}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-04-14 14:00 SF
|length=1
|window=Readers deployment window
|who=Readers
|what=NOTE: This window is often skipped. The reader teams do not typically check IRC, so assume the window is unused once it is 5 minutes past the start time.
}}
{{Deployment calendar event card
|when=2026-04-14 23:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC early)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
==={{Deployment_day|date=2026-04-15}}===
{{Deployment calendar event card
|when=2026-04-15 00:00 SF
|length=1
|window=[[Backport windows|UTC morning backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Amir1|Amir}}, {{ircnick|Urbanecm|Martin}}, {{ircnick|awight|Adam}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-04-15 03:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC mid-day)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-04-15 04:00 SF
|length=1
|window=[[mw:Services|Services]] – [[Citoid]] / [[Zotero]]
|who=Marielle ({{ircnick|mvolz}})
|what=See [[mw:Citoid|Citoid]]
}}
{{Deployment calendar event card
|when=2026-04-15 06:00 SF
|length=1
|window=[[Backport windows|UTC afternoon backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Lucas_WMDE|Lucas}}, {{ircnick|Urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-04-15 07:00 SF
|length=1
|window=Wikifunctions Services UTC Afternoon
|who=Abstract Wikipedia team (Africa, Europe, Eastern Americas)
|what=Wikifunctions back-end k8s services
}}
{{Deployment calendar event card
|when=2026-04-15 07:30 SF
|length=0.5
|window=Test Kitchen Experiment Deployment Window
|who=Test Kitchen
|what=Automatic start/stop of active experiments and instruments managed by [[Test Kitchen]].
}}
{{Deployment calendar event card
|when=2026-04-15 10:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC late)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-04-15 11:00 SF
|length=2
|window=MediaWiki train - Utc-7 Version
|who={{ircnick|dduvall|Dan}}, {{ircnick|dancy|Ahmon}}
|what=[[mw:MediaWiki 1.46/Roadmap#Schedule for the deployments|1.46 schedule]]
{{DeployOneWeekMini|1.46.0-wmf.24|1.46.0-wmf.23->1.46.0-wmf.24|1.46.0-wmf.23}}
* group1 to [[mw:MediaWiki_1.46/wmf.24|1.46.0-wmf.24]]
* '''Blockers: {{phabricator|T420482}}'''
}}
{{Deployment calendar event card
|when=2026-04-15 13:00 SF
|length=1
|window=[[Backport windows|UTC late backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|RoanKattouw|Roan}}, {{ircnick|Urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}, {{ircnick|kindrobot|Stef}}, {{ircnick|cjming|Clare}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-04-15 14:00 SF
|length=1
|window=Wikifunctions Services UTC Late
|who=Abstract Wikipedia team (North and South America)
|what=Wikifunctions back-end k8s services
}}
{{Deployment calendar event card
|when=2026-04-15 15:00 SF
|length=1
|window=Readers deployment window
|who=Readers
|what=NOTE: This window is often skipped. The reader teams do not typically check IRC, so assume it is unused if nothing has started within 5 minutes of the start time.
}}
{{Deployment calendar event card
|when=2026-04-15 23:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC early)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-04-15 23:00 SF
|length=0.5
|window=Primary database switchover
|who={{ircnick|marostegui|Manuel Arostegui}}, {{ircnick|Amir1|Amir}}, {{ircnick|federico3|Federico Ceratto}}
|what=Held deployment window for primary database maintenance
}}
==={{Deployment_day|date=2026-04-16}}===
{{Deployment calendar event card
|when=2026-04-16 00:00 SF
|length=1
|window=[[Backport windows|UTC morning backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Amir1|Amir}}, {{ircnick|Urbanecm|Martin}}, {{ircnick|awight|Adam}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-04-16 03:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC mid-day)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-04-16 05:00 SF
|length=1
|window=Mobileapps/RESTBase/Wikifeeds
|who=Content Transform Team
|what=Content Transform Team Node.js services (mobileapps/wikifeeds)
}}
{{Deployment calendar event card
|when=2026-04-16 06:00 SF
|length=1
|window=[[Backport windows|UTC afternoon backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Lucas_WMDE|Lucas}}, {{ircnick|Urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-04-16 07:30 SF
|length=0.5
|window=Test Kitchen Experiment Deployment Window
|who=Test Kitchen
|what=Automatic start/stop of active experiments and instruments managed by [[Test Kitchen]].
}}
{{Deployment calendar event card
|when=2026-04-16 08:00 SF
|length=1
|window=Train log triage
|who={{ircnick|dduvall|Dan}}, {{ircnick|dancy|Ahmon}}
|what=See [[Heterogeneous deployment/Train deploys#Breakage]]
}}
{{Deployment calendar event card
|when=2026-04-16 09:00 SF
|length=1
|window=[[Puppet request window]]<br/><small>'''(Max 6 patches)'''</small>
|who={{ircnick|jhathaway|JHathaway}}, {{ircnick|rzl|Reuven}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to Puppet change''
}}
{{Deployment calendar event card
|when=2026-04-16 10:00 SF
|length=1
|window=Cloud Services/Technical Documentation weekly deploy (Toolhub, Developer portal, Striker)
|who={{ircnick|bd808}}
|what=...
}}
{{Deployment calendar event card
|when=2026-04-16 10:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC late)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-04-16 11:00 SF
|length=2
|window=MediaWiki train - Utc-7 Version
|who={{ircnick|dduvall|Dan}}, {{ircnick|dancy|Ahmon}}
|what=[[mw:MediaWiki 1.46/Roadmap#Schedule for the deployments|1.46 schedule]]
{{DeployOneWeekMini|1.46.0-wmf.24|1.46.0-wmf.24|1.46.0-wmf.23->1.46.0-wmf.24}}
* group2 to [[mw:MediaWiki_1.46/wmf.24|1.46.0-wmf.24]]
* '''Blockers: {{phabricator|T420482}}'''
}}
{{Deployment calendar event card
|when=2026-04-16 13:00 SF
|length=1
|window=[[Backport windows|UTC late backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|RoanKattouw|Roan}}, {{ircnick|Urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}, {{ircnick|kindrobot|Stef}}, {{ircnick|cjming|Clare}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-04-16 14:00 SF
|length=1
|window=Readers deployment window
|who=Readers
|what=NOTE: This window is often skipped. The reader teams do not typically check IRC, so assume it is unused if nothing has started within 5 minutes of the start time.
}}
{{Deployment calendar event card
|when=2026-04-16 23:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC early)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
==={{Deployment_day|date=2026-04-17}}===
{{Deployment calendar event card
|when=2026-04-17 00:00 SF
|length=24
|window=No deploys all day! See [[Deployments/Emergencies]] if things are broken.
|who=
|what=No Deploys
}}
{{Deployment calendar event card
|when=2026-04-17 04:00 SF
|length=0.5
|window=GitLab version upgrades
|who={{ircnick|jelto|Jelto}}, {{ircnick|arnoldokoth|Arnold}}, {{ircnick|mutante|Daniel}}, {{ircnick|arnaudb|Arnaud}}
|what=GitLab version upgrades
}}
==={{Deployment_day|date=2026-04-18}}===
{{Deployment calendar event card
|when=2026-04-18 00:00 SF
|length=24
|window=No deploys all day! See [[Deployments/Emergencies]] if things are broken.
|who=
|what=No Deploys
}}
Server Admin Log
0
7919
2398865
2398857
2026-04-06T02:00:37Z
Stashbot
7414
mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
2398865
wikitext
text/x-wiki
== 2026-04-06 ==
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (, [[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2398866
2398865
2026-04-06T02:06:53Z
Stashbot
7414
mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
2398866
wikitext
text/x-wiki
== 2026-04-06 ==
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (, [[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
jl91yfs7maps7l5t7c2cjptvw2hy77h
2398867
2398866
2026-04-06T05:51:25Z
Stashbot
7414
marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
2398867
wikitext
text/x-wiki
== 2026-04-06 ==
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (, [[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2026-04-06T07:43:29Z
Stashbot
7414
kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631|Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
== 2026-04-06 ==
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (, [[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2398870
2398869
2026-04-06T07:59:36Z
Stashbot
7414
kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631|Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
== 2026-04-06 ==
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (, [[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2exw5dripec5ea224cg4db6p8nx5k4k
2398871
2398870
2026-04-06T08:02:50Z
Stashbot
7414
kgraessle@deploy1003: kgraessle: Continuing with sync
2398871
wikitext
text/x-wiki
== 2026-04-06 ==
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2398873
2398871
2026-04-06T08:15:24Z
Stashbot
7414
kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631|Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
== 2026-04-06 ==
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (, [[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto HDFS
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2398874
2398873
2026-04-06T08:35:44Z
Stashbot
7414
urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061|[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
2398874
wikitext
text/x-wiki
== 2026-04-06 ==
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (, [[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2398875
2398874
2026-04-06T08:37:21Z
Stashbot
7414
urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061|[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
2398875
wikitext
text/x-wiki
== 2026-04-06 ==
* 08:37 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
glw5i5bk9schoas5iy7hw4taipi24x2
2398876
2398875
2026-04-06T08:40:02Z
Stashbot
7414
urbanecm@deploy1003: urbanecm: Continuing with sync
2398876
wikitext
text/x-wiki
== 2026-04-06 ==
* 08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
* 08:37 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (, [[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
qw9oz69qrxdtqnmj61a6w1c2opmeb4m
2398877
2398876
2026-04-06T08:46:34Z
Stashbot
7414
urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268061|[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] (duration: 10m 50s)
2398877
wikitext
text/x-wiki
== 2026-04-06 ==
* 08:46 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] (duration: 10m 50s)
* 08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
* 08:37 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from URL downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto HDFS
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2398878
2398877
2026-04-06T08:48:40Z
Stashbot
7414
marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
== 2026-04-06 ==
* 08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
* 08:46 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] (duration: 10m 50s)
* 08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
* 08:37 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (, [[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2398879
2398878
2026-04-06T08:55:51Z
Stashbot
7414
urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268196|SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197|[i18n] Correct the action message (T420154)]], [[gerrit:1268198|refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199|Create a new grant for the echo-read-notifications (T420154)]]
== 2026-04-06 ==
* 08:55 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]]
* 08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
* 08:46 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] (duration: 10m 50s)
* 08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
* 08:37 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2398880
2398879
2026-04-06T08:59:49Z
Stashbot
7414
marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1169: test
wikitext
text/x-wiki
== 2026-04-06 ==
* 08:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1169: test
* 08:55 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]]
* 08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
* 08:46 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] (duration: 10m 50s)
* 08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
* 08:37 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2026-04-06T09:00:18Z
Stashbot
== 2026-04-06 ==
* 09:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1169: test
* 08:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1169: test
* 08:55 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]]
* 08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
* 08:46 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] (duration: 10m 50s)
* 08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
* 08:37 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (, [[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2398882
2398881
2026-04-06T09:00:41Z
Stashbot
7414
marostegui@cumin1003: dbctl commit (dc=all): 'Pool db1169', diff saved to https://phabricator.wikimedia.org/P90257 and previous config saved to /var/cache/conftool/dbconfig/20260406-090040-marostegui.json
== 2026-04-06 ==
* 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Pool db1169', diff saved to https://phabricator.wikimedia.org/P90257 and previous config saved to /var/cache/conftool/dbconfig/20260406-090040-marostegui.json
* 09:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1169: test
* 08:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1169: test
* 08:55 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]]
* 08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
* 08:46 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] (duration: 10m 50s)
* 08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
* 08:37 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (, [[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2398883
2398882
2026-04-06T09:01:15Z
Stashbot
7414
marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Upgrade
2398883
wikitext
text/x-wiki
== 2026-04-06 ==
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Upgrade
* 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Pool db1169', diff saved to https://phabricator.wikimedia.org/P90257 and previous config saved to /var/cache/conftool/dbconfig/20260406-090040-marostegui.json
* 09:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1169: test
* 08:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1169: test
* 08:55 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]]
* 08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
* 08:46 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] (duration: 10m 50s)
* 08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
* 08:37 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (, [[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto HDFS
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2398884
2398883
2026-04-06T09:01:16Z
Stashbot
7414
marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
== 2026-04-06 ==
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Upgrade
* 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Pool db1169', diff saved to https://phabricator.wikimedia.org/P90257 and previous config saved to /var/cache/conftool/dbconfig/20260406-090040-marostegui.json
* 09:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1169: test
* 08:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1169: test
* 08:55 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]]
* 08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
* 08:46 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] (duration: 10m 50s)
* 08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
* 08:37 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (, [[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from URL downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2398885
2398884
2026-04-06T09:01:25Z
Stashbot
7414
marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
== 2026-04-06 ==
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Upgrade
* 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Pool db1169', diff saved to https://phabricator.wikimedia.org/P90257 and previous config saved to /var/cache/conftool/dbconfig/20260406-090040-marostegui.json
* 09:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1169: test
* 08:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1169: test
* 08:55 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]]
* 08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
* 08:46 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] (duration: 10m 50s)
* 08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
* 08:37 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (, [[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2398886
2398885
2026-04-06T09:01:26Z
Stashbot
7414
marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Upgrade
== 2026-04-06 ==
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Upgrade
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Upgrade
* 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Pool db1169', diff saved to https://phabricator.wikimedia.org/P90257 and previous config saved to /var/cache/conftool/dbconfig/20260406-090040-marostegui.json
* 09:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1169: test
* 08:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1169: test
* 08:55 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]]
* 08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
* 08:46 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] (duration: 10m 50s)
* 08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
* 08:37 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (, [[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2398887
2398886
2026-04-06T09:10:42Z
Stashbot
7414
marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
== 2026-04-06 ==
* 09:10 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Upgrade
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Upgrade
* 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Pool db1169', diff saved to https://phabricator.wikimedia.org/P90257 and previous config saved to /var/cache/conftool/dbconfig/20260406-090040-marostegui.json
* 09:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1169: test
* 08:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1169: test
* 08:55 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]]
* 08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
* 08:46 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] (duration: 10m 50s)
* 08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
* 08:37 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (, [[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from URL downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2026-04-06T09:10:49Z
Stashbot
marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
== 2026-04-06 ==
* 09:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:10 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Upgrade
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Upgrade
* 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Pool db1169', diff saved to https://phabricator.wikimedia.org/P90257 and previous config saved to /var/cache/conftool/dbconfig/20260406-090040-marostegui.json
* 09:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1169: test
* 08:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1169: test
* 08:55 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]]
* 08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
* 08:46 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] (duration: 10m 50s)
* 08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
* 08:37 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (, [[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto HDFS
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2398889
2398888
2026-04-06T09:15:16Z
Stashbot
7414
urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268196|SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197|[i18n] Correct the action message (T420154)]], [[gerrit:1268198|refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199|Create a new grant for the echo-read-notifications (T420154)]] synced to the testservers (see https://wikitech.wikimedia
2398889
wikitext
text/x-wiki
== 2026-04-06 ==
* 09:15 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] synced to the testservers (see https://wikitech.wikimedia
* 09:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:10 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Upgrade
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Upgrade
* 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Pool db1169', diff saved to https://phabricator.wikimedia.org/P90257 and previous config saved to /var/cache/conftool/dbconfig/20260406-090040-marostegui.json
* 09:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1169: test
* 08:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1169: test
* 08:55 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]]
* 08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
* 08:46 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] (duration: 10m 50s)
* 08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
* 08:37 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (, [[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from URL downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2026-04-06T09:15:42Z
Stashbot
7414
urbanecm@deploy1003: urbanecm: Continuing with sync
== 2026-04-06 ==
* 09:15 urbanecm@deploy1003: urbanecm: Continuing with sync
* 09:15 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] synced to the testservers (see https://wikitech.wikimedia
* 09:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:10 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Upgrade
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Upgrade
* 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Pool db1169', diff saved to https://phabricator.wikimedia.org/P90257 and previous config saved to /var/cache/conftool/dbconfig/20260406-090040-marostegui.json
* 09:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1169: test
* 08:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1169: test
* 08:55 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]]
* 08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
* 08:46 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] (duration: 10m 50s)
* 08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
* 08:37 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (, [[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2lgmchyh6tyt3tkp3qzqiiiuiglky5u
2398891
2398890
2026-04-06T09:27:09Z
Stashbot
7414
marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2142.codfw.wmnet,db1152.eqiad.wmnet with reason: Upgrade
2398891
wikitext
text/x-wiki
== 2026-04-06 ==
* 09:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2142.codfw.wmnet,db1152.eqiad.wmnet with reason: Upgrade
* 09:15 urbanecm@deploy1003: urbanecm: Continuing with sync
* 09:15 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] synced to the testservers (see https://wikitech.wikimedia
* 09:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:10 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Upgrade
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Upgrade
* 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Pool db1169', diff saved to https://phabricator.wikimedia.org/P90257 and previous config saved to /var/cache/conftool/dbconfig/20260406-090040-marostegui.json
* 09:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1169: test
* 08:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1169: test
* 08:55 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]]
* 08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
* 08:46 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] (duration: 10m 50s)
* 08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
* 08:37 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (, [[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2398892
2398891
2026-04-06T09:27:38Z
Stashbot
7414
== 2026-04-06 ==
* 09:27 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] (duration: 31m 47s)
* 09:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2142.codfw.wmnet,db1152.eqiad.wmnet with reason: Upgrade
* 09:15 urbanecm@deploy1003: urbanecm: Continuing with sync
* 09:15 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] synced to the testservers (see https://wikitech.wikimedia
* 09:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:10 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Upgrade
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Upgrade
* 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Pool db1169', diff saved to https://phabricator.wikimedia.org/P90257 and previous config saved to /var/cache/conftool/dbconfig/20260406-090040-marostegui.json
* 09:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1169: test
* 08:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1169: test
* 08:55 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]]
* 08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
* 08:46 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] (duration: 10m 50s)
* 08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
* 08:37 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2398893
2398892
2026-04-06T09:30:43Z
Stashbot
7414
marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2142.codfw.wmnet with OS trixie
2398893
wikitext
text/x-wiki
== 2026-04-06 ==
* 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2142.codfw.wmnet with OS trixie
* 09:27 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] (duration: 31m 47s)
* 09:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2142.codfw.wmnet,db1152.eqiad.wmnet with reason: Upgrade
* 09:15 urbanecm@deploy1003: urbanecm: Continuing with sync
* 09:15 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] synced to the testservers (see https://wikitech.wikimedia
* 09:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:10 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Upgrade
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Upgrade
* 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Pool db1169', diff saved to https://phabricator.wikimedia.org/P90257 and previous config saved to /var/cache/conftool/dbconfig/20260406-090040-marostegui.json
* 09:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1169: test
* 08:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1169: test
* 08:55 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]]
* 08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
* 08:46 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] (duration: 10m 50s)
* 08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
* 08:37 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto HDFS
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2398894
2398893
2026-04-06T09:48:27Z
Stashbot
7414
marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
2398894
wikitext
text/x-wiki
== 2026-04-06 ==
* 09:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
* 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2142.codfw.wmnet with OS trixie
* 09:27 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] (duration: 31m 47s)
* 09:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2142.codfw.wmnet,db1152.eqiad.wmnet with reason: Upgrade
* 09:15 urbanecm@deploy1003: urbanecm: Continuing with sync
* 09:15 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] synced to the testservers (see https://wikitech.wikimedia
* 09:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:10 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Upgrade
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Upgrade
* 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Pool db1169', diff saved to https://phabricator.wikimedia.org/P90257 and previous config saved to /var/cache/conftool/dbconfig/20260406-090040-marostegui.json
* 09:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1169: test
* 08:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1169: test
* 08:55 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]]
* 08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
* 08:46 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] (duration: 10m 50s)
* 08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
* 08:37 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (, [[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2026-04-06T09:54:25Z
Stashbot
== 2026-04-06 ==
* 09:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
* 09:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
* 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2142.codfw.wmnet with OS trixie
* 09:27 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] (duration: 31m 47s)
* 09:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2142.codfw.wmnet,db1152.eqiad.wmnet with reason: Upgrade
* 09:15 urbanecm@deploy1003: urbanecm: Continuing with sync
* 09:15 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] synced to the testservers (see https://wikitech.wikimedia
* 09:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:10 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Upgrade
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Upgrade
* 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Pool db1169', diff saved to https://phabricator.wikimedia.org/P90257 and previous config saved to /var/cache/conftool/dbconfig/20260406-090040-marostegui.json
* 09:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1169: test
* 08:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1169: test
* 08:55 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]]
* 08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
* 08:46 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] (duration: 10m 50s)
* 08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
* 08:37 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (, [[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2398896
2398895
2026-04-06T10:12:21Z
Stashbot
7414
marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2142.codfw.wmnet with OS trixie
2398896
wikitext
text/x-wiki
== 2026-04-06 ==
* 10:12 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2142.codfw.wmnet with OS trixie
* 09:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
* 09:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
* 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2142.codfw.wmnet with OS trixie
* 09:27 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] (duration: 31m 47s)
* 09:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2142.codfw.wmnet,db1152.eqiad.wmnet with reason: Upgrade
* 09:15 urbanecm@deploy1003: urbanecm: Continuing with sync
* 09:15 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] synced to the testservers (see https://wikitech.wikimedia
* 09:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:10 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Upgrade
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Upgrade
* 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Pool db1169', diff saved to https://phabricator.wikimedia.org/P90257 and previous config saved to /var/cache/conftool/dbconfig/20260406-090040-marostegui.json
* 09:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1169: test
* 08:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1169: test
* 08:55 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]]
* 08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
* 08:46 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] (duration: 10m 50s)
* 08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
* 08:37 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
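The entries above follow a regular shape (<code>* HH:MM actor: message</code>, newest first), and cookbook runs are logged as a <code>START</code> line followed later by an <code>END (STATUS)</code> line. The following is a minimal sketch, not an official tool, of how such lines could be parsed and paired to estimate how long a cookbook run took. The function names <code>parse_entry</code> and <code>cookbook_durations</code> are hypothetical, and the sketch assumes all lines belong to the same day and that the text after the cookbook name is identical on the START and END lines apart from the <code>(exit_code=N)</code> marker.

```python
import re

# "* HH:MM actor: message" -- the actor may be "user@host" or a bare nickname.
ENTRY_RE = re.compile(r"^\* (?P<time>\d{2}:\d{2}) (?P<who>[^:\s]+): (?P<msg>.*)$")

# "START - Cookbook name ..." or "END (PASS) - Cookbook name (exit_code=0) ..."
COOKBOOK_RE = re.compile(
    r"^(?P<phase>START|END|DONE)(?: \((?P<status>\w+)\))? - Cookbook "
    r"(?P<name>\S+)(?P<rest>.*)$"
)


def parse_entry(line):
    """Split one log line into (time, actor, message), or None if it does not match."""
    m = ENTRY_RE.match(line)
    if m is None:
        return None
    return m.group("time"), m.group("who"), m.group("msg")


def cookbook_durations(lines):
    """Pair START/END cookbook lines and return (name, target, status, minutes) tuples.

    Assumption: `lines` is newest-first (as on the wiki page) and covers one day.
    """
    open_runs = {}
    results = []
    for line in reversed(lines):  # walk in chronological order
        entry = parse_entry(line)
        if entry is None:
            continue
        time_str, who, msg = entry
        m = COOKBOOK_RE.match(msg)
        if m is None:
            continue
        # Drop the exit code so START and END lines share the same key.
        target = re.sub(r" \(exit_code=\d+\)", "", m.group("rest"), count=1).strip()
        key = (who, m.group("name"), target)
        minute = int(time_str[:2]) * 60 + int(time_str[3:])
        if m.group("phase") == "START":
            open_runs[key] = minute
        elif key in open_runs:
            started = open_runs.pop(key)
            results.append((m.group("name"), target, m.group("status"), minute - started))
    return results


sample = [
    "* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm",
    "* 08:44 moritzm: installing Apache security updates",
    "* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm",
]
print(cookbook_durations(sample))
```

Lines logged as <code>DONE (PASS)</code> have no matching START entry, so this sketch skips them. Resolution is limited to whole minutes because the log records only <code>HH:MM</code>.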
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
7gsfwab2qx37yiiypy31qau6p03fx1f
2398897
2398896
2026-04-06T10:15:20Z
Stashbot
7414
marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
2398897
wikitext
text/x-wiki
== 2026-04-06 ==
* 10:15 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 10:12 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2142.codfw.wmnet with OS trixie
* 09:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
* 09:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
* 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2142.codfw.wmnet with OS trixie
* 09:27 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] (duration: 31m 47s)
* 09:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2142.codfw.wmnet,db1152.eqiad.wmnet with reason: Upgrade
* 09:15 urbanecm@deploy1003: urbanecm: Continuing with sync
* 09:15 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] synced to the testservers (see https://wikitech.wikimedia
* 09:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:10 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Upgrade
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Upgrade
* 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Pool db1169', diff saved to https://phabricator.wikimedia.org/P90257 and previous config saved to /var/cache/conftool/dbconfig/20260406-090040-marostegui.json
* 09:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1169: test
* 08:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1169: test
* 08:55 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]]
* 08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
* 08:46 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] (duration: 10m 50s)
* 08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
* 08:37 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
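The <code>Finished scap ...</code> entries above end with a trailing <code>(duration: MMm SSs)</code> marker. As a small illustration, assuming that marker always has this two-field form (only this form appears in the entries on this page), the duration could be extracted as follows. The helper name <code>scap_duration_seconds</code> is hypothetical.

```python
import re

# Matches the trailing "(duration: 31m 47s)" marker on "Finished scap ..." lines.
DURATION_RE = re.compile(r"\(duration: (?P<min>\d+)m (?P<sec>\d+)s\)\s*$")


def scap_duration_seconds(line):
    """Return the logged scap duration in seconds, or None if the line has no marker."""
    m = DURATION_RE.search(line)
    if m is None:
        return None
    return int(m.group("min")) * 60 + int(m.group("sec"))


line = "* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)"
print(scap_duration_seconds(line))
```

<code>Started scap ...</code> lines carry no marker, so the helper returns <code>None</code> for them.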
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (, [[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2398898
2398897
2026-04-06T10:15:35Z
Stashbot
7414
marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
== 2026-04-06 ==
* 10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 10:15 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 10:12 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2142.codfw.wmnet with OS trixie
* 09:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
* 09:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
* 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2142.codfw.wmnet with OS trixie
* 09:27 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] (duration: 31m 47s)
* 09:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2142.codfw.wmnet,db1152.eqiad.wmnet with reason: Upgrade
* 09:15 urbanecm@deploy1003: urbanecm: Continuing with sync
* 09:15 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] synced to the testservers (see https://wikitech.wikimedia
* 09:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:10 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Upgrade
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Upgrade
* 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Pool db1169', diff saved to https://phabricator.wikimedia.org/P90257 and previous config saved to /var/cache/conftool/dbconfig/20260406-090040-marostegui.json
* 09:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1169: test
* 08:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1169: test
* 08:55 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]]
* 08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
* 08:46 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] (duration: 10m 50s)
* 08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
* 08:37 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from URL downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2398900
2398898
2026-04-06T11:14:16Z
Stashbot
7414
marostegui@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=s3
2398900
wikitext
text/x-wiki
== 2026-04-06 ==
* 11:14 marostegui@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 10:15 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 10:12 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2142.codfw.wmnet with OS trixie
* 09:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
* 09:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
* 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2142.codfw.wmnet with OS trixie
* 09:27 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] (duration: 31m 47s)
* 09:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2142.codfw.wmnet,db1152.eqiad.wmnet with reason: Upgrade
* 09:15 urbanecm@deploy1003: urbanecm: Continuing with sync
* 09:15 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] synced to the testservers (see https://wikitech.wikimedia
* 09:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:10 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Upgrade
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Upgrade
* 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Pool db1169', diff saved to https://phabricator.wikimedia.org/P90257 and previous config saved to /var/cache/conftool/dbconfig/20260406-090040-marostegui.json
* 09:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1169: test
* 08:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1169: test
* 08:55 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]]
* 08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
* 08:46 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] (duration: 10m 50s)
* 08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
* 08:37 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (, [[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2398901
2398900
2026-04-06T11:24:24Z
Stashbot
7414
marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
== 2026-04-06 ==
* 11:24 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 11:14 marostegui@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 10:15 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 10:12 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2142.codfw.wmnet with OS trixie
* 09:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
* 09:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
* 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2142.codfw.wmnet with OS trixie
* 09:27 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] (duration: 31m 47s)
* 09:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2142.codfw.wmnet,db1152.eqiad.wmnet with reason: Upgrade
* 09:15 urbanecm@deploy1003: urbanecm: Continuing with sync
* 09:15 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] synced to the testservers (see https://wikitech.wikimedia
* 09:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:10 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Upgrade
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Upgrade
* 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Pool db1169', diff saved to https://phabricator.wikimedia.org/P90257 and previous config saved to /var/cache/conftool/dbconfig/20260406-090040-marostegui.json
* 09:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1169: test
* 08:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1169: test
* 08:55 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]]
* 08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
* 08:46 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] (duration: 10m 50s)
* 08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
* 08:37 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (, [[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from URL downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
== 2026-04-06 ==
* 11:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 11:24 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 11:14 marostegui@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 10:15 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 10:12 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2142.codfw.wmnet with OS trixie
* 09:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
* 09:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
* 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2142.codfw.wmnet with OS trixie
* 09:27 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] (duration: 31m 47s)
* 09:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2142.codfw.wmnet,db1152.eqiad.wmnet with reason: Upgrade
* 09:15 urbanecm@deploy1003: urbanecm: Continuing with sync
* 09:15 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] synced to the testservers (see https://wikitech.wikimedia
* 09:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:10 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Upgrade
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Upgrade
* 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Pool db1169', diff saved to https://phabricator.wikimedia.org/P90257 and previous config saved to /var/cache/conftool/dbconfig/20260406-090040-marostegui.json
* 09:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1169: test
* 08:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1169: test
* 08:55 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]]
* 08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
* 08:46 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] (duration: 10m 50s)
* 08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
* 08:37 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (, [[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from URL downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto HDFS
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2398903
2398902
2026-04-06T11:26:21Z
Stashbot
7414
marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on pc2015.codfw.wmnet with reason: Maintenance
2398903
wikitext
text/x-wiki
== 2026-04-06 ==
* 11:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on pc2015.codfw.wmnet with reason: Maintenance
* 11:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 11:24 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 11:14 marostegui@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 10:15 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 10:12 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2142.codfw.wmnet with OS trixie
* 09:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
* 09:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
* 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2142.codfw.wmnet with OS trixie
* 09:27 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] (duration: 31m 47s)
* 09:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2142.codfw.wmnet,db1152.eqiad.wmnet with reason: Upgrade
* 09:15 urbanecm@deploy1003: urbanecm: Continuing with sync
* 09:15 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] synced to the testservers (see https://wikitech.wikimedia
* 09:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:10 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Upgrade
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Upgrade
* 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Pool db1169', diff saved to https://phabricator.wikimedia.org/P90257 and previous config saved to /var/cache/conftool/dbconfig/20260406-090040-marostegui.json
* 09:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1169: test
* 08:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1169: test
* 08:55 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]]
* 08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
* 08:46 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] (duration: 10m 50s)
* 08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
* 08:37 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (, [[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto HDFS
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to HAProxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to HAProxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
easfk4flvj5v66guwus6newyc1p25tx
2398904
2398903
2026-04-06T11:26:43Z
Stashbot
7414
marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on pc2015.codfw.wmnet,pc1015.eqiad.wmnet with reason: Maintenance
2398904
wikitext
text/x-wiki
== 2026-04-06 ==
* 11:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on pc2015.codfw.wmnet,pc1015.eqiad.wmnet with reason: Maintenance
* 11:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on pc2015.codfw.wmnet with reason: Maintenance
* 11:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 11:24 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 11:14 marostegui@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 10:15 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 10:12 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2142.codfw.wmnet with OS trixie
* 09:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
* 09:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
* 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2142.codfw.wmnet with OS trixie
* 09:27 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] (duration: 31m 47s)
* 09:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2142.codfw.wmnet,db1152.eqiad.wmnet with reason: Upgrade
* 09:15 urbanecm@deploy1003: urbanecm: Continuing with sync
* 09:15 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] synced to the testservers (see https://wikitech.wikimedia
* 09:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:10 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Upgrade
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Upgrade
* 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Pool db1169', diff saved to https://phabricator.wikimedia.org/P90257 and previous config saved to /var/cache/conftool/dbconfig/20260406-090040-marostegui.json
* 09:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1169: test
* 08:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1169: test
* 08:55 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]]
* 08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
* 08:46 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] (duration: 10m 50s)
* 08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
* 08:37 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (, [[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from URL downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
ovo82u9lf4e1h9adfhgjjudjy4n9dki
2398905
2398904
2026-04-06T11:28:33Z
Stashbot
7414
marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host pc1015.eqiad.wmnet with OS trixie
2398905
wikitext
text/x-wiki
== 2026-04-06 ==
* 11:28 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host pc1015.eqiad.wmnet with OS trixie
* 11:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on pc2015.codfw.wmnet,pc1015.eqiad.wmnet with reason: Maintenance
* 11:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on pc2015.codfw.wmnet with reason: Maintenance
* 11:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 11:24 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 11:14 marostegui@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 10:15 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 10:12 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2142.codfw.wmnet with OS trixie
* 09:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
* 09:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
* 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2142.codfw.wmnet with OS trixie
* 09:27 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] (duration: 31m 47s)
* 09:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2142.codfw.wmnet,db1152.eqiad.wmnet with reason: Upgrade
* 09:15 urbanecm@deploy1003: urbanecm: Continuing with sync
* 09:15 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] synced to the testservers (see https://wikitech.wikimedia
* 09:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:10 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Upgrade
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Upgrade
* 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Pool db1169', diff saved to https://phabricator.wikimedia.org/P90257 and previous config saved to /var/cache/conftool/dbconfig/20260406-090040-marostegui.json
* 09:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1169: test
* 08:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1169: test
* 08:55 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]]
* 08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
* 08:46 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] (duration: 10m 50s)
* 08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
* 08:37 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (, [[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2398906
2398905
2026-04-06T11:29:26Z
Stashbot
7414
marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host pc2015.codfw.wmnet with OS trixie
2398906
wikitext
text/x-wiki
== 2026-04-06 ==
* 11:29 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host pc2015.codfw.wmnet with OS trixie
* 11:28 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host pc1015.eqiad.wmnet with OS trixie
* 11:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on pc2015.codfw.wmnet,pc1015.eqiad.wmnet with reason: Maintenance
* 11:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on pc2015.codfw.wmnet with reason: Maintenance
* 11:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 11:24 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 11:14 marostegui@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 10:15 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 10:12 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2142.codfw.wmnet with OS trixie
* 09:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
* 09:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
* 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2142.codfw.wmnet with OS trixie
* 09:27 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] (duration: 31m 47s)
* 09:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2142.codfw.wmnet,db1152.eqiad.wmnet with reason: Upgrade
* 09:15 urbanecm@deploy1003: urbanecm: Continuing with sync
* 09:15 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] synced to the testservers (see https://wikitech.wikimedia
* 09:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:10 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Upgrade
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Upgrade
* 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Pool db1169', diff saved to https://phabricator.wikimedia.org/P90257 and previous config saved to /var/cache/conftool/dbconfig/20260406-090040-marostegui.json
* 09:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1169: test
* 08:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1169: test
* 08:55 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]]
* 08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
* 08:46 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] (duration: 10m 50s)
* 08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
* 08:37 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (, [[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto HDFS
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2398907
2398906
2026-04-06T11:43:20Z
Stashbot
7414
marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1015.eqiad.wmnet with reason: host reimage
== 2026-04-06 ==
* 11:43 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1015.eqiad.wmnet with reason: host reimage
* 11:29 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host pc2015.codfw.wmnet with OS trixie
* 11:28 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host pc1015.eqiad.wmnet with OS trixie
* 11:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on pc2015.codfw.wmnet,pc1015.eqiad.wmnet with reason: Maintenance
* 11:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on pc2015.codfw.wmnet with reason: Maintenance
* 11:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 11:24 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 11:14 marostegui@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 10:15 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 10:12 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2142.codfw.wmnet with OS trixie
* 09:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
* 09:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
* 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2142.codfw.wmnet with OS trixie
* 09:27 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] (duration: 31m 47s)
* 09:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2142.codfw.wmnet,db1152.eqiad.wmnet with reason: Upgrade
* 09:15 urbanecm@deploy1003: urbanecm: Continuing with sync
* 09:15 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] synced to the testservers (see https://wikitech.wikimedia
* 09:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:10 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Upgrade
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Upgrade
* 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Pool db1169', diff saved to https://phabricator.wikimedia.org/P90257 and previous config saved to /var/cache/conftool/dbconfig/20260406-090040-marostegui.json
* 09:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1169: test
* 08:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1169: test
* 08:55 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]]
* 08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
* 08:46 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] (duration: 10m 50s)
* 08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
* 08:37 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (, [[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from URL downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2026-04-06T11:48:03Z
Stashbot
marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc2015.codfw.wmnet with reason: host reimage
== 2026-04-06 ==
* 11:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc2015.codfw.wmnet with reason: host reimage
* 11:43 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1015.eqiad.wmnet with reason: host reimage
* 11:29 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host pc2015.codfw.wmnet with OS trixie
* 11:28 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host pc1015.eqiad.wmnet with OS trixie
* 11:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on pc2015.codfw.wmnet,pc1015.eqiad.wmnet with reason: Maintenance
* 11:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on pc2015.codfw.wmnet with reason: Maintenance
* 11:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 11:24 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 11:14 marostegui@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 10:15 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 10:12 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2142.codfw.wmnet with OS trixie
* 09:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
* 09:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
* 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2142.codfw.wmnet with OS trixie
* 09:27 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] (duration: 31m 47s)
* 09:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2142.codfw.wmnet,db1152.eqiad.wmnet with reason: Upgrade
* 09:15 urbanecm@deploy1003: urbanecm: Continuing with sync
* 09:15 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] synced to the testservers (see https://wikitech.wikimedia
* 09:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:10 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Upgrade
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Upgrade
* 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Pool db1169', diff saved to https://phabricator.wikimedia.org/P90257 and previous config saved to /var/cache/conftool/dbconfig/20260406-090040-marostegui.json
* 09:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1169: test
* 08:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1169: test
* 08:55 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]]
* 08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
* 08:46 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] (duration: 10m 50s)
* 08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
* 08:37 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
42xgm62qr34n5vdva6oetiox9s7ehim
2398909
2398908
2026-04-06T11:49:47Z
Stashbot
7414
marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1015.eqiad.wmnet with reason: host reimage
2398909
wikitext
text/x-wiki
== 2026-04-06 ==
* 11:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1015.eqiad.wmnet with reason: host reimage
* 11:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc2015.codfw.wmnet with reason: host reimage
* 11:43 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1015.eqiad.wmnet with reason: host reimage
* 11:29 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host pc2015.codfw.wmnet with OS trixie
* 11:28 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host pc1015.eqiad.wmnet with OS trixie
* 11:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on pc2015.codfw.wmnet,pc1015.eqiad.wmnet with reason: Maintenance
* 11:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on pc2015.codfw.wmnet with reason: Maintenance
* 11:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 11:24 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 11:14 marostegui@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 10:15 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 10:12 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2142.codfw.wmnet with OS trixie
* 09:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
* 09:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
* 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2142.codfw.wmnet with OS trixie
* 09:27 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] (duration: 31m 47s)
* 09:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2142.codfw.wmnet,db1152.eqiad.wmnet with reason: Upgrade
* 09:15 urbanecm@deploy1003: urbanecm: Continuing with sync
* 09:15 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] synced to the testservers (see https://wikitech.wikimedia
* 09:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:10 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Upgrade
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Upgrade
* 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Pool db1169', diff saved to https://phabricator.wikimedia.org/P90257 and previous config saved to /var/cache/conftool/dbconfig/20260406-090040-marostegui.json
* 09:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1169: test
* 08:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1169: test
* 08:55 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]]
* 08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
* 08:46 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] (duration: 10m 50s)
* 08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
* 08:37 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto HDFS
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
fhgyrh38f9mexhb6z21k3ixtmzimrov
2398910
2398909
2026-04-06T11:53:35Z
Stashbot
7414
marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc2015.codfw.wmnet with reason: host reimage
2398910
wikitext
text/x-wiki
== 2026-04-06 ==
* 11:53 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc2015.codfw.wmnet with reason: host reimage
* 11:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1015.eqiad.wmnet with reason: host reimage
* 11:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc2015.codfw.wmnet with reason: host reimage
* 11:43 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1015.eqiad.wmnet with reason: host reimage
* 11:29 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host pc2015.codfw.wmnet with OS trixie
* 11:28 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host pc1015.eqiad.wmnet with OS trixie
* 11:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on pc2015.codfw.wmnet,pc1015.eqiad.wmnet with reason: Maintenance
* 11:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on pc2015.codfw.wmnet with reason: Maintenance
* 11:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 11:24 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 11:14 marostegui@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 10:15 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 10:12 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2142.codfw.wmnet with OS trixie
* 09:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
* 09:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
* 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2142.codfw.wmnet with OS trixie
* 09:27 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] (duration: 31m 47s)
* 09:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2142.codfw.wmnet,db1152.eqiad.wmnet with reason: Upgrade
* 09:15 urbanecm@deploy1003: urbanecm: Continuing with sync
* 09:15 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]] synced to the testservers (see https://wikitech.wikimedia
* 09:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:10 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Upgrade
* 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Upgrade
* 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Pool db1169', diff saved to https://phabricator.wikimedia.org/P90257 and previous config saved to /var/cache/conftool/dbconfig/20260406-090040-marostegui.json
* 09:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1169: test
* 08:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1169: test
* 08:55 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268196{{!}}SECURITY: Protect ApiEchoNotifications with a new user right (T420154)]], [[gerrit:1268197{{!}}[i18n] Correct the action message (T420154)]], [[gerrit:1268198{{!}}refactor: Use a trait to check for reading permissions (T420154)]], [[gerrit:1268199{{!}}Create a new grant for the echo-read-notifications (T420154)]]
* 08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
* 08:46 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] (duration: 10m 50s)
* 08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
* 08:37 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1268061{{!}}[Growth] Decrease user impact limits back to the defaults (T422288 T341599)]]
* 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] (duration: 31m 54s)
* 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
* 07:59 kgraessle@deploy1003: kgraessle: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for [[gerrit:1264631{{!}}Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)]]
* 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-05 ==
* 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-04 ==
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-04-03 ==
* 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
* 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
* 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
* 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
* 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
* 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
* 14:52 sbassett: Deployed security mitigation for [[phab:T422244|T422244]]
* 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
* 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
* 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
* 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
* 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
* 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
* 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:54 brouberol@dns1004: END - running authdns-update
* 09:52 brouberol@dns1004: START - running authdns-update
* 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
* 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
* 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
* 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
* 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
* 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
* 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # [[phab:T422062|T422062]]
* 00:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] (duration: 06m 50s)
* 00:53 zabe@deploy1003: zabe: Continuing with sync
* 00:53 zabe@deploy1003: zabe: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:51 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1267286{{!}}Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)]]
== 2026-04-02 ==
* 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:41 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] (duration: 06m 10s)
* 23:37 zabe@deploy1003: zabe: Continuing with sync
* 23:37 zabe@deploy1003: zabe: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:35 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1264110{{!}}Start reading from new file table in dewiki and fawiki (T416548)]]
* 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] (duration: 07m 33s)
* 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
* 22:00 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1267214{{!}}Fix section heading spacing on mobile (T414882)]]
* 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] (duration: 06m 18s)
* 21:28 kemayo@deploy1003: kemayo: Continuing with sync
* 21:28 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:26 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:18 kemayo@deploy1003: kemayo: Continuing with sync
* 21:17 kemayo@deploy1003: kemayo: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:15 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267204{{!}}SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise]]
* 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] (duration: 11m 40s)
* 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
* 20:53 kemayo@deploy1003: annet, kemayo: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1267008{{!}}Add logged-in reader retention instrument (T420490)]]
* 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] (duration: 11m 46s)
* 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
* 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1264569{{!}}zhwikinews: 20th anniversary logo change (T420165)]]
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T418109|T418109]]
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:56 cmooney@dns2005: END - running authdns-update
* 18:55 cmooney@dns2005: START - running authdns-update
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
* 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:18 swfrench@dns1004: END - running authdns-update
* 17:16 swfrench@dns1004: START - running authdns-update
* 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
* 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
* 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] (duration: 29m 56s)
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
* 15:51 swfrench@deploy1003: swfrench: Continuing with sync
* 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
* 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
* 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} (attempt 2) - [[phab:T422143|T422143]]
* 15:32 moritzm: installing freetype security updates
* 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - [[phab:T422166|T422166]]
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]] (duration: 26m 48s)
* 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - [[phab:T422166|T422166]]
* 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:23 papaul: maintenance complete on mr1-eqiad
* 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:11 moritzm: installing apache2 security updates
* 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
* 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:59 papaul: ongoing maintenance on mr1-eqiad
* 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
* 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up {{Gerrit|1267062}}, {{Gerrit|1266985}} - [[phab:T422143|T422143]]
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
* 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
* 14:42 moritzm: installing libxml-parser-perl security updates
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
* 14:28 moritzm: installing pyasn1 security updates
* 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
* 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1267062{{!}}Bump maxConnCount]]
* 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:09 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # [[phab:T421114|T421114]]
* 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
* 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
* 13:58 esanders@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
* 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1266985{{!}}Fix suggestion mode availability check (T422143)]]
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
* 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
* 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
* 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - [[phab:T414486|T414486]]
* 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, [[phab:T414486|T414486]]]
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
* 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
* 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
* 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
* 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
* 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
* 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
* 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
* 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 12:13 volans@dns1004: END - running authdns-update
* 12:11 volans@dns1004: START - running authdns-update
* 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
* 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
* 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
* 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
* 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
* 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
* 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
* 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
* 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
* 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 10:19 moritzm: installing freetype security updates
* 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
* 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
* 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:49 moritzm: added Atsuko to the cn=ops LDAP group [[phab:T421860|T421860]]
* 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
* 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
* 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 08:42 XioNoX: reboot mr1-esams - [[phab:T416450|T416450]]
* 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
* 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] (duration: 10m 13s)
* 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
* 07:55 jmm@dns1004: END - running authdns-update
* 07:54 jmm@dns1004: START - running authdns-update
* 07:52 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1266866{{!}}Disable external link analysis (T419837)]]
* 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] (duration: 06m 39s)
* 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]]) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
* 07:43 jnuche@deploy1003: jnuche: Continuing with sync
* 07:43 jnuche@deploy1003: jnuche: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:41 jnuche@deploy1003: Started scap sync-world: Backport for [[gerrit:1266861{{!}}ApiAuthManagerHelper: Accept fields with undefined label (T422027)]]
* 07:38 moritzm: purge prometheus-nginx-exporter from URL downloaders, remnants of early hCaptcha rollout
* 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
* 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
* 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
* 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
* 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] (duration: 07m 00s)
* 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
* 07:07 gkyziridis@deploy1003: gkyziridis: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for [[gerrit:1266228{{!}}EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
== 2026-04-01 ==
* 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
* 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - [[phab:T368096|T368096]]
* 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - [[phab:T368096|T368096]]
* 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] (duration: 08m 25s)
* 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
* 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1265482{{!}}Legal Footer Link Deploys (T420348)]]
* 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] (duration: 06m 37s)
* 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
* 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 22:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266443{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]], [[gerrit:1266442{{!}}Deferred: Fix function to get virtual domain (T421914 T398709)]]
* 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
* 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
* 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
* 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] (duration: 07m 15s)
* 21:38 swfrench@deploy1003: swfrench: Continuing with sync
* 21:36 swfrench@deploy1003: swfrench: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:35 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1266406{{!}}Only set the thumb step when width is given (T422074)]], [[gerrit:1266407{{!}}Only set the thumb step when width is given (T422074)]]
* 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
* 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] (duration: 08m 47s)
* 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
* 20:06 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:04 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1266314{{!}}config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)]]
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
* 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
* 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
* 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
* 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
* 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
* 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
* 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
* 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] (duration: 08m 18s)
* 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 17:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1266326{{!}}Refix thumb steps for the poster image of videos (T414805)]], [[gerrit:1266327{{!}}Refix thumb steps for the poster image of videos (T414805)]]
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
* 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
* 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
* 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 01m 53s)
* 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83] (duration: 04m 15s)
* 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] [analytics/refinery@fa28ad83]
* 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
* 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
* 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API [[phab:T415202|T415202]] TEST [analytics/refinery@fa28ad83]
* 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]] (duration: 07m 25s)
* 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - [[phab:T368096|T368096]]
* 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] (duration: 11m 30s)
* 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
* 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1266317{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]], [[gerrit:1266316{{!}}hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)]]
* 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test ([[phab:T421402|T421402]])
* 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] (duration: 09m 31s)
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
* 16:21 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1266309{{!}}Set the default for UserEmailConfirmationUseHTML to true (T411147)]], [[gerrit:1260011{{!}}cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)]]
* 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
* 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
* 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer ([[phab:T421714|T421714]], prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
* 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] (duration: 12m 53s)
* 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:09 jforrester@deploy1003: jforrester: Continuing with sync
* 15:03 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266290{{!}}Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)]]
* 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
* 14:59 taavi@dns1004: END - running authdns-update
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
* 14:57 taavi@dns1004: START - running authdns-update
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
* 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
* 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
* 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
* 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
* 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
* 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:44 fabfur: upgrading ulsfo to haproxy 3.2 ([[phab:T421402|T421402]])
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
* 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
* 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
* 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] (duration: 08m 14s)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
* 14:12 jforrester@deploy1003: jforrester: Continuing with sync
* 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: jforrester: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
* 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1266190{{!}}MemcachedWrapper: Hash key when longer than 250 characters]], [[gerrit:1266219{{!}}Extend queue processing times for abstract fragments (T421581)]]
* 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
* 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
* 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
* 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
* 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
* 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
* 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
* 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
* 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
* 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
* 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
* 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
* 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
* 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
* 13:21 fabfur: upgrading magru to haproxy 3.2 ([[phab:T421402|T421402]])
* 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade ([[phab:T421402|T421402]])
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
* 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
* 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] (duration: 09m 21s)
* 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 12:52 kharlan@deploy1003: kharlan: Continuing with sync
* 12:49 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
* 12:47 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266227{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]], [[gerrit:1266226{{!}}hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)]]
* 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
* 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] (duration: 07m 34s)
* 12:29 kharlan@deploy1003: kharlan: Continuing with sync
* 12:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1266223{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]], [[gerrit:1266222{{!}}Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)]]
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
* 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
* 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
* 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
* 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
* 11:33 moritzm: installing tomcat10 security updates
* 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
* 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
* 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
* 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
* 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
* 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
* 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
* 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
* 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
* 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
* 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
* 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
* 10:13 jmm@dns1004: END - running authdns-update
* 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:11 jmm@dns1004: START - running authdns-update
* 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
* 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
* 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
* 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
* 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
* 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
* 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
* 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
* 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
* 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
* 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
* 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
* 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
* 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
* 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
* 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin ([[phab:T406724|T406724]])
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
* 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:44 moritzm: installing Apache security updates
* 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
* 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
* 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
* 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
* 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
* 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
* 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
* 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
* 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs [[phab:T420480|T420480]]
* 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
* 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 [[phab:T419637|T419637]] [[phab:T410975|T410975]]
* 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
* 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
* 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
* 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 moritzm: installing postgresql security updates
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
* 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
* 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
* 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
* 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
* 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist [[phab:T421353|T421353]]
* 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis [[phab:T420093|T420093]]
* 05:26 marostegui: Drop global_block_whitelist on closed wikis [[phab:T420525|T420525]]
* 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 08m 35s)
* 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 00:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265629{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] (duration: 12m 40s)
* 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:29 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]] synced to the testservers (see https://wikitech.wiki
* 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265631{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265630{{!}}util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589)]], [[gerrit:1265632{{!}}Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)]]
* 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] (duration: 06m 50s)
* 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
* 00:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1265623{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]], [[gerrit:1265624{{!}}LinksUpdate: Consolidate links virtual domains (T421914)]]
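The entries above appear to share one shape: <code>* HH:MM actor: message</code>, where the actor is either <code>user@host</code> or a bare nickname. The following is a minimal sketch, assuming that shape holds, of how such a line could be split into fields. The names <code>ENTRY_RE</code>, <code>SalEntry</code> and <code>parse_sal_entry</code> are hypothetical and not part of any Wikimedia tooling.

```python
import re
from typing import NamedTuple, Optional

# Assumed entry shape, inferred from the log lines above:
#   "* HH:MM actor: message"
# where actor is "user@host" or a bare nickname (no spaces, no colons).
ENTRY_RE = re.compile(
    r"^\* (?P<time>\d{2}:\d{2}) (?P<actor>[^\s:]+): (?P<message>.*)$"
)


class SalEntry(NamedTuple):
    time: str
    user: str
    host: Optional[str]
    message: str


def parse_sal_entry(line: str) -> Optional[SalEntry]:
    """Parse one log line; return None if it does not match the assumed shape."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        return None
    # Split "user@host" into its parts; host stays None for bare nicknames.
    user, _, host = match.group("actor").partition("@")
    return SalEntry(match.group("time"), user, host or None, match.group("message"))


if __name__ == "__main__":
    sample = "* 08:44 moritzm: installing Apache security updates"
    print(parse_sal_entry(sample))
```

Lines that do not fit the pattern, such as section headings, yield <code>None</code> rather than raising, so a caller can skip them.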
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
User talk:DCaro (WMF)
3
446778
2398872
2398851
2026-04-06T08:08:15Z
MBH
3865
/* Webservice on Toolforge */ reply ([[mw:c:Special:MyLanguage/User:JWBTH/CD|CD]])
2398872
wikitext
text/x-wiki
== Welcome to Toolforge! ==
Hello David Caro, welcome to the Toolforge project! Your request for access was processed, and you should be able to use ssh to connect to <tt>login.toolforge.org</tt>. You will need to log out and log in again at https://toolsadmin.wikimedia.org/ to activate your new permissions there.
Check the [[Help:Toolforge|Toolforge help page]] for tips on using your account. You can also ask questions in our IRC channel at {{irc|wikimedia-cloud}} or send an e-mail to our mailing list <tt>cloud@lists.wikimedia.org</tt>.
Thank you, and have fun making Tools! --[[User:StrikerBot|StrikerBot]] ([[User talk:StrikerBot|talk]]) 13:53, 2 November 2020 (UTC)
== Wikimedia Hackathon Northwestern Europe 2026 ==
Hello! I came across your name on a previous Wikimedia hackathon participant page, so I thought you might be interested in this.
We're organizing the [[mw:Wikimedia Hackathon Northwestern Europe 2026|Wikimedia Hackathon Northwestern Europe 2026]], taking place on '''13–14 March 2026''' in '''Arnhem, the Netherlands'''. It's a two-day, in-person hackathon for technical Wikimedians from the region.
Since you've attended a hackathon before, you already know how valuable these events can be for collaboration, learning, and getting things done together. We'd love to have you join us!
[https://docs.google.com/forms/d/e/1FAIpQLSdYOnOg1iq-8M4xWw8foHUw_7fReWTKtVH_GHzGI2_ozWww9Q/viewform '''Apply here'''] – registration closes mid-January or when full.
Feel free to reach out if you have any questions. Hope to see you in Arnhem! [[User:Daanvr|Daanvr]] ([[User talk:Daanvr|talk]]) 14:59, 12 January 2026 (UTC)
== Webservice on Toolforge ==
Hello. Two years ago you helped me move my webservice from the grid to k8s using a buildservice-created image. You advised me to switch from CGI/C#/mono to modern dotnet, which has native Linux support. I now plan to do that: I created a simple dotnet app in Visual Studio, but what should I do next to run it on Toolforge? Two years ago you said it should also be an image created by buildservice, like my old code. Could you explain how I should build a new image and run it on Toolforge? My current build scripts are here: https://github.com/Saisengen/wikibots/tree/main (you wrote them). [[User:MBH|MBH]] ([[User talk:MBH|talk]]) 12:46, 1 April 2026 (UTC)
:Hi @[[User:MBH|MBH]]! I'm a bit busy right now. You can create the new code in a new repository and use [[Help:Toolforge/My first .NET tool]] to build and deploy it. When building, you can also give the image a different name, like <code><nowiki>toolforge build start --image-name my-new-web https://gitlab.wikimedia.org/toolforge-repos/sample-dotnet-buildpack-app.git</nowiki></code>. I can try to take a look eventually, but I can't commit to anything soon. [[User:DCaro (WMF)|DCaro (WMF)]] ([[User talk:DCaro (WMF)|talk]]) 15:27, 1 April 2026 (UTC)
:: Thanks. I created a repo (https://github.com/Saisengen/webservice/tree/master) and built it successfully, but now I have one more issue. When I get a 500 server error on Toolforge (this does not happen when debugging locally on my PC), the error message isn't written to the ''error.log'' file on the Toolforge filesystem, unlike the earlier behavior. Where can I read the error messages? [[User:MBH|MBH]] ([[User talk:MBH|talk]]) 11:15, 3 April 2026 (UTC)
:::You can try <code>toolforge we service logs</code> (note that if the logs are older than 1h, you'll have to pass <code>--since 5h</code>, for example) [[User:DCaro (WMF)|DCaro (WMF)]] ([[User talk:DCaro (WMF)|talk]]) 15:07, 3 April 2026 (UTC)
:::: ''tools.mbh@tools-bastion-15:~$ toolforge we service logs<br> Usage: toolforge [OPTIONS] COMMAND [ARGS]...<br> Try 'toolforge --help' for help.<br> Error: No such command 'we'.'' [[User:MBH|MBH]] ([[User talk:MBH|talk]]) 04:32, 4 April 2026 (UTC)
:::::that was a typo xd, it's "webservice" not "we service" [[User:DCaro (WMF)|DCaro (WMF)]] ([[User talk:DCaro (WMF)|talk]]) 15:09, 4 April 2026 (UTC)
: Thanks. One more question: when I need to read a static file from the Toolforge filesystem, I read it by absolute path (/data/project/mbh/file.txt) and it works. But when I want to read a file included in the build (https://github.com/Saisengen/webservice/blob/master/cpf.html, for example) and try to read it by relative path (just StringReader("cpf.html")), it doesn't work. How do I read files from my project? [[User:MBH|MBH]] ([[User talk:MBH|talk]]) 08:08, 6 April 2026 (UTC)
Map of database maintenance
0
449160
2398864
2398855
2026-04-06T00:01:22Z
Dexbot
30554
Bot: Updating the report
2398864
wikitext
text/x-wiki
{{/Header}}
== Today (2026-04-06) ==
== Yesterday (2026-04-05) ==
== Last seven days ==
{| class="wikitable"
|+ eqiad
|-
! Section !! Work
|-
| s2 || [[phab:T419635|Drop il_to column from imagelinks table in wmf production (T419635)]] (fceratto)
|-
| s5 || [[phab:T419635|Drop il_to column from imagelinks table in wmf production (T419635)]] (fceratto)
|-
| s6 || [[phab:T419635|Drop il_to column from imagelinks table in wmf production (T419635)]] (fceratto)
|-
|}
{| class="wikitable"
|+ codfw
|-
! Section !! Work
|-
| s2 || [[phab:T419635|Drop il_to column from imagelinks table in wmf production (T419635)]] (fceratto)
|-
| s3 || [[phab:T419635|Drop il_to column from imagelinks table in wmf production (T419635)]] (fceratto)
|-
| s5 || [[phab:T419635|Drop il_to column from imagelinks table in wmf production (T419635)]] (fceratto)
|-
|}
[[Category:MariaDB]]
Nova Resource:Tools.wikivisage/SAL
498
459970
2398899
2398852
2026-04-06T10:32:18Z
Stashbot
7414
wmbot~difronzo@tools-bastion-15: deployed v0.7.5 (by DiFronzo on GitHub)
2398899
wikitext
text/x-wiki
=== 2026-04-06 ===
* 10:32 wmbot~difronzo@tools-bastion-15: deployed v0.7.5 (by DiFronzo on GitHub)
=== 2026-04-04 ===
* 15:55 wmbot~difronzo@tools-bastion-15: deployed v0.7.4 (by DiFronzo on GitHub)
=== 2026-04-02 ===
* 14:03 wmbot~difronzo@tools-bastion-15: deployed v0.7.3 (by DiFronzo on GitHub)
=== 2026-03-25 ===
* 20:28 wmbot~difronzo@tools-bastion-15: deployed v0.7.1 (by DiFronzo on GitHub)
=== 2026-03-24 ===
* 21:16 wmbot~difronzo@tools-bastion-15: deployed v0.7.0 with database wipe (by DiFronzo on GitHub)
=== 2026-03-23 ===
* 17:27 wmbot~difronzo@tools-bastion-15: deployed v0.6.7 (by DiFronzo)
<noinclude>[[Category:SAL]]</noinclude>