Hacker News Comments on
Mastering Outages with Incident Command for DevOps: Learning from the Fire Department
IT Revolution · YouTube · 63 HN points · 0 HN comments
Hacker News Stories and Comments
All the comments and stories posted to Hacker News that reference this video.
⬐ janzer
For anyone wanting to dig into the ICS, the authoritative introduction is the NIMS IS-100 [1] course. Be aware that it's your standard government-produced course, i.e. not entertaining in any way, but it generally manages to get the point across.
If I understand/remember correctly, IS-100 and 200 are the required courses on ICS (along with 700 and 800 for NIMS itself) for all frontline firefighters in any fire department wanting to receive federal money.
1. https://training.fema.gov/is/courseoverview.aspx?code=IS-100...
⬐ mbubb
Yes, these are dry... I took them as part of a CERT org a few years ago and retook them for the EMS operations chapter of a class I'm in right now.
⬐ mbubb
Coincidentally, I am finishing up an EMT class and prepping for the NREMT test. I have been very struck by the similarity between the differential diagnosis process an EMT uses and troubleshooting a server or network issue. Similar problems, too, when you get tunnel vision ("It must be asthma" / "It must be the DNS resolver", etc.). Nice video!
⬐ monkmartinez
ICS seems like a natural fit for so many professions. I wonder if there is a future for a Fire Officer to teach/adapt ICS for <insert profession>?
⬐ deadmanwalking
I know of at least one company that has an ex-fire chief from the US as well as other services specifically to teach IC to IT and other areas: http://www.blackrock3.com/
Really useful training. I have had to use it multiple times to coordinate responses to ransomware and other IT disasters, and it does benefit from the buy-in of senior leadership at the company I work for.
The biggest challenge we have is keeping ICs: after the decision to centralise the IC organisation in 3 locations, all ICs were offered the choice to relocate or leave.
We have a lot of very new ICs.
⬐ chanandler_bong
I had a 10-year background in fire/EMS before getting into tech, and have "sneaked in" ICS concepts and practices to several of my employers and teams.
I tried to introduce the ICS concepts formally and up-front, but met a lot of pushback: "we don't need that", "we're not dealing with fires" or "it's too complicated".
By using ICS principles without calling them such, people usually see the value. I even got a sizable promotion and raise due to my "clear and concise handling of several serious incidents and putting procedures in place to handle similar in the future". All I did was direct people into ICS functions and act as an IC.
I agree that ICS concepts should be used more widely and outside of just emergency services, but getting past the "stigma" of the title is the hard part.
⬐ mlosapio
Volunteer firefighter and SRE here.
ICS is crucial in any and all of our incidents and should be the model for how any disaster is handled.
⬐ nodesocket
"While I go work with Slack for a couple years..."
Can't be encouraging for Slack, a couple of years. But that's the deal now: people leave tech jobs every 1-5 years for something better or to start their own company.
⬐ sokoloff
If tech companies want employees to stay longer, they need to work to make that the best option for their employees. If they don’t, we shouldn’t be surprised that employees leave to take a better option.
⬐ brentchapman
I’m the speaker in that video. Historically, on average, I’ve stayed at each of my employers for about 2 years. Google was a big exception, as I was there for almost 6 years, but partly that was because I had 3 fairly different roles during the time I was there.
So far, a month in, I’m loving working for Slack; it’s a great company, and an excellent group of people. So there’s every possibility that I’ll be there longer than 2 years!
⬐ bmv
test
⬐ KineticLensman
ICS like this is already well developed for dealing with cyber incidents, e.g. malware incursion or data exfiltration. A really well thought through example is the NIST cyber security framework [0], which defines a cyber defensive lifecycle (identify, protect, detect, respond and recover). At least in the UK, this lifecycle has been adopted by Critical National Infrastructure (CNI) organisations such as electricity generation, transmission and distribution.
The key to successful incident response is to design, agree and test the ICS processes, roles and Command and Control (C2) hierarchies before incidents occur, capturing the results in an Incident Response Plan (IRP). The IRP will typically involve standing up an incident response team when an incident reaches a pre-defined severity level. Incident response teams are often structured into bronze, silver and gold levels of command (gold typically including Cxx individuals such as CIO and CFO, not just DevOps roles) that temporarily replace Business as Usual (BAU) management with incident C2. In the UK at least, the bronze/silver/gold hierarchy was developed by the ‘blue-light’ services (police, fire and ambulance), and is itself a simplified version of military C2 chains of command.
If management don't buy into this type of approach, it doesn't bode well for an organisation's ability to deal with a crisis.