Ed Batalla

NERC Monitoring and Situational
Awareness Conference: Loss of Control
Center Procedures and Testing Practices
Ed Batalla – Director of Technology
Florida Power & Light Company
September 19, 2013
Agenda
• Florida Power & Light Company (FPL) Overview
– FPL EMS Overview
– FPL Control Centers / Infrastructure Philosophy
• Problem Statement
– EMS Availability
– Viability of Backup Processes
• Solutions
– PLAN: Solidify Backup Procedures
– PREPARE: Failover Testing Processes
– MITIGATE: Improve Alternative Real-Time Assessment
Tools/Methods
• Q&A
Florida Power & Light (FPL) is the largest electric utility in
Florida and one of the largest rate-regulated utilities in the
United States
FPL Overview
• FPL is a subsidiary of
NextEra Energy, Inc. (NEE)
• One of the largest U.S.
electric utilities
• Vertically integrated, retail
rate-regulated
• 4.6 MM customer accounts
• 24,653 MW in operation
• $10.1 B in operating
revenues
• $36 B in total assets
NOTE: All data as of June 30, 2013, except operating revenue which is for the year ended December 31, 2012.
FPL’s Energy Management System (EMS) was upgraded on
11/10/2012 – it is a major component of the suite of mission-critical systems grouped as Grid Control Systems (GCS)
Energy Management System (EMS) Overview
• Vendor:
– Commissioned: 11/10/2012
– Version:
• Benefits of the upgrade
– Improved redundancy and
geographic diversity
– Advanced reliability tools
– Enhanced cyber security
(user authentication)
• EMS Interfaces
– Power Plants
– Substations (T&D)
– External Utilities
– Distribution Control Centers
– Performance Diagnostic Centers
– Historian Systems
– Other Corporate Systems
FPL’s EMS was upgraded to modernize system technology and
infrastructure
FPL has geographically diverse and redundant control
centers which contain the Grid Control Systems (GCS) – an
integral part of our business continuity and recovery plans
Backup Control Center (BUCC) Facility
– Fully functional backup control center capable of being activated within the required standards; its tools closely replicate those of the primary control center, which minimizes activation confusion
System Control Center (SCC) Facility
Local Backup Control Center
Remote Backup Control Center
– FPL EMS is also used by the FRCC Reliability Coordinator (RC)
– Distribution Control Centers use and access the same EMS
The control room has emergency communication methods for loss of the
primary communication tools; it also has access to the Backup Control
Center
FPL’s EMS uses state-of-the-art infrastructure systems and
technology
Infrastructure Design/Philosophy
• Geographically diverse and redundant control center
facilities
– Each facility is equipped with redundant systems, and each facility
backs up the other
– Connected via dedicated and redundant communication links
• Diverse and redundant communications to all FPL
substations
– SCADA data from FPL substations are dual-scanned from both
the primary and backup control center facilities
• Cybersecurity via defense-in-depth philosophy (logical
separation of control center network from corporate
network)
There are event categories directly related to the loss of
monitoring or control functionality for control centers –
procedures and testing practices need to be strengthened
Event Categories
• Category 1 Events
– Unplanned evacuation from a control center
facility with BPS SCADA functionality
– Loss of monitoring or control, at a control
center, such that it significantly affects the
entity’s ability to make operating decisions
• Category 2 Events
– Complete loss of all BPS control center voice
communication systems
– Complete loss of SCADA, control or
monitoring functionality
Source: Electric Reliability Organization Event Analysis Process – Version 2 (July 2013)
FPL implemented solutions to improve control center
procedures and testing practices
Solutions
• PLAN: Solidify recovery and
business continuity plans
• PREPARE: Validate backup
control center processes
through actual technology
viability validation
– Failover processes
– Track performance
• MITIGATE: Continuous
improvements on alternative
real-time assessment tools
Measuring performance is key to ensuring backup process viability
FPL has extensive business continuity / recovery plans for
all electronic systems
Business Continuity Plan (BCP) / Recovery Plan (RP)
• Primary Control Center Facility
– Manned with Operators (RC, TOP, BA, IA)
– Technology Support: provided through Operational Technology Center (OTC) and
callout support
• Backup Control Center
– Full redundant system (with diverse communication paths)
– Unmanned – no Operators
– Technology Support: remotely supported (technical team
located at primary facility)
BCP and RP specify criteria for evacuation
The recovery plan is reviewed and tested annually pursuant to NERC CIP
Standards
The BCP and RP define a set of criteria for evacuating the
primary control facility
Criteria for Evacuation
• Evacuation Criteria
– Incapacitated facility (fire, terrorist attack)
– Total loss of building power supply
– Critical function unavailability at the primary control center facility:
no ability to connect to backup control center servers;
redundant pair EMS or SCADA Front End not available
• Evacuation Process and Interim Provisions
– Since the backup control center is unmanned, FPL implemented
interim control centers to facilitate the evacuation process and meet
the EOP Standards:
Local Backup Control Center (adjacent building);
Remote Backup Control Center (approx. 5 miles away)
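The evacuation criteria on this slide can be read as a simple decision rule. The following is a minimal sketch in Python, not FPL's actual procedure. The class and field names are hypothetical, and the sketch assumes that the two sub-conditions under "critical function unavailability" are alternatives; the slide does not say whether they combine with AND or OR.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class PrimaryFacilityStatus:
    """Hypothetical status flags for the primary control center (illustrative names)."""
    facility_incapacitated: bool = False    # e.g. fire, terrorist attack
    building_power_lost: bool = False       # total loss of building power supply
    can_reach_backup_servers: bool = True   # connectivity to backup control center servers
    redundant_pair_available: bool = True   # redundant pair EMS or SCADA Front End


def evacuation_reasons(status: PrimaryFacilityStatus) -> list[str]:
    """Return the evacuation criteria that are currently met (empty list = stay)."""
    reasons = []
    if status.facility_incapacitated:
        reasons.append("incapacitated facility")
    if status.building_power_lost:
        reasons.append("total loss of building power supply")
    # Assumption: either sub-condition alone counts as critical function unavailability.
    if not status.can_reach_backup_servers or not status.redundant_pair_available:
        reasons.append("critical function unavailability")
    return reasons
```

An empty list means no criterion is met; any non-empty list would prompt a move to one of the interim control centers described above.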
Health of the system is monitored constantly by 24/7
technology personnel
System and Application Health Check
FPL established the Operational Technology Center (OTC) organization to
improve “operational certainty”
Improved monitoring process capabilities with alarming on
CA failures
CA Solution Progress Monitoring
Dashboards were
developed to improve
situational awareness on
real-time assessment tool
viability
OTC and operators are instructed to be on high alert for CA yellow
and red bar displays
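One way to picture alarming on CA failures is a watchdog that colors a status bar by the age of the last valid contingency analysis solution. The sketch below is an assumption about how such a bar could work, not a description of FPL's dashboard: the slides do not define what yellow and red mean, and the thresholds and function name are hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds; the presentation does not state the actual values.
YELLOW_AFTER = timedelta(minutes=10)
RED_AFTER = timedelta(minutes=30)


def ca_status(last_good_solution: datetime, now: datetime) -> str:
    """Classify CA health by the age of the last valid solution (assumed scheme)."""
    age = now - last_good_solution
    if age >= RED_AFTER:
        return "red"
    if age >= YELLOW_AFTER:
        return "yellow"
    return "green"
```

A monitoring loop could call `ca_status` on each refresh and raise an alarm whenever the result is not "green".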
A “CONS OPS” button has been added to access the
Conservative Operations display, which shows contingency data at
key interfaces (in case CA is not available)
Conservative Operations
“Cons Ops” Button
FPL’s EMS/SCADA uses high-speed replication of data
between two control center sites
MRS to Support Failover Schemes
• Memory Replication
Services (MRS) is used to
replicate data between
systems
• Essential to support
failover schemes
– Critical operational data is
replicated automatically
MRS provides a state-of-the-art, high availability configuration and
failover scheme for FPL’s EMS/SCADA
FPL purposely performs a weekly scheduled failover of its
EMS (and corresponding peripheral systems) to ensure
viability of its critical systems (primary and backup systems)
Testing Practices
• Weekly Scheduled Failover Test
– Cycles through different system configurations to ensure viability
– Ensures that failover logic is fully functional
– Coordinated with all users (done every Wednesday morning)
• Other
– EOP processes and procedures reinforced during operator
training; operators cycle through familiarization of backup systems
and control centers
– CIP-required test of recovery plan fulfilled (at least annually)
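The weekly test cycles through system configurations on Wednesday mornings. A rotation like that could be scheduled as in the sketch below. The configuration names and the round-robin scheme are assumptions for illustration; the slides do not list the actual configurations or how the next one is chosen.

```python
from datetime import date, timedelta

# Hypothetical configuration names; the presentation does not list the real ones.
CONFIGURATIONS = [
    "primary site, server A",
    "primary site, server B",
    "backup site, server A",
    "backup site, server B",
]


def next_wednesday(today: date) -> date:
    """Return today if it is a Wednesday, otherwise the next Wednesday."""
    days_ahead = (2 - today.weekday()) % 7  # Monday is 0, so Wednesday is 2
    return today + timedelta(days=days_ahead)


def configuration_for(test_day: date) -> str:
    """Pick a configuration round-robin by week count (an assumed scheme).

    Using the ordinal day number divided by 7 gives a counter that advances
    by exactly one per week, so the rotation does not reset at year end.
    """
    week_index = test_day.toordinal() // 7
    return CONFIGURATIONS[week_index % len(CONFIGURATIONS)]
```

With four configurations, each one is exercised once every four weeks, so a failure in a rarely used configuration surfaces within a month.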
Unplanned EMS Unavailability
Cumulative Downtime YTD = 28.98 minutes
[Chart: EMS Unplanned Unavailability. Vertical axis ticks: 0 to 50. Horizontal axis: Jan to Dec. Labels: Target (max minutes downtime), Planned and Unplanned, Unplanned. The chart carries a "GOOD" direction indicator.]
FPL tracks cumulative EMS unavailability as part of its key performance
indicators
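A cumulative unavailability KPI of this kind is a running total of downtime minutes compared against a ceiling. The sketch below shows the arithmetic only. The monthly values and the target are illustrative placeholders, not FPL's data; the slide reports only the year-to-date total of 28.98 minutes.

```python
from itertools import accumulate

# Illustrative numbers only; these are NOT FPL's monthly figures.
monthly_unplanned_minutes = [0.0, 4.5, 0.0, 12.25, 0.0, 3.0]  # Jan through Jun
ANNUAL_TARGET_MINUTES = 50.0  # hypothetical ceiling, not stated in the slides


def cumulative_downtime(monthly: list) -> list:
    """Running year-to-date total of downtime minutes, one entry per month."""
    return list(accumulate(monthly))


def within_target(monthly: list, target: float) -> bool:
    """True while the year-to-date total is at or below the target."""
    return sum(monthly) <= target


ytd = cumulative_downtime(monthly_unplanned_minutes)
print(f"YTD downtime: {ytd[-1]:.2f} min, within target: "
      f"{within_target(monthly_unplanned_minutes, ANNUAL_TARGET_MINUTES)}")
```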
Depending on the situation, FPL incorporated processes for
operators to access alternative real-time assessment tools
Alternative Real-Time Assessment Tools
• Flowgate Display
– Pre-calculated flowgate limits
– Uses data that is not dependent on
the EMS
– Other displays were developed that
use non-EMS data sources
• CA solution sharing between
reliability entities within the FRCC
– Currently being piloted
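A flowgate display built on pre-calculated limits can be approximated as a lookup plus a comparison. The sketch below assumes flows arrive from some non-EMS source as a plain mapping; the flowgate names, limits, and warning fraction are hypothetical and not taken from the presentation.

```python
# Hypothetical flowgate names and limits in MW; not from the presentation.
PRECALCULATED_LIMITS_MW = {"Flowgate A": 1200.0, "Flowgate B": 800.0}


def flowgate_loading(flows_mw, limits_mw=PRECALCULATED_LIMITS_MW, warn_percent=90.0):
    """Compare measured flows with pre-calculated limits.

    Returns one (name, percent_of_limit, state) tuple per flowgate, where
    state is "ok", "warning", "exceeded", or "no data".
    """
    rows = []
    for name, limit in limits_mw.items():
        flow = flows_mw.get(name)
        if flow is None:
            rows.append((name, None, "no data"))
            continue
        percent = 100.0 * abs(flow) / limit
        if percent > 100.0:
            state = "exceeded"
        elif percent >= warn_percent:
            state = "warning"
        else:
            state = "ok"
        rows.append((name, round(percent, 1), state))
    return rows
```

Because the limits are computed ahead of time, the check needs only current flow measurements, which is what lets it keep working when the EMS is unavailable.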
FPL has three major focus areas for its continuous
improvement plan to continue to strengthen control center
procedures and testing practices
Final Note
• Reliability entities need to continue to strengthen control
center procedures and testing practices
• FPL has three major focus areas for its continuous
improvement plan:
– PLAN: Solidify recovery and business continuity plans
– PREPARE: Validate backup control center processes through
actual technology viability validation
– MITIGATE: Continuous improvements on alternative real-time
assessment tools