IT Infrastructure Library (ITIL)
Mission Statement
Strategy, Tactics (Planning), Operations (day-to-day)
Business IT Alignment, Service Delivery, Service Support
IT Service Management Overview
BUSINESS (Customer)
User User User
SERVICE SUPPORT: SD, IM, PROBLEM, CH, REL, CONFIGURATION
SERVICE DELIVERY: SLM, AM, CM, IT SCM, FINANCE
SPOC – Single Point of Contact
SD – Service Desk
IM – Incident Management
CH – Change Management
REL – Release Management
SLA – Service Level Agreement
SLM – Service Level Management
AM – Availability Management
CM – Capacity Management
Incident and Problem flow (diagram): USER, SD / IM, INCIDENT, INCIDENT DB, PROBLEM DB, KE DB (Description & Solution in plain English)
PROBLEM – One or more incidents with unknown cause
PROBLEM CONTROL (PM) – Find root cause & temporary fix (workaround) OR permanent fix, giving a KNOWN ERROR
ERROR CONTROL (PM) – If business case to fix? NO: STOP. YES: Raise RFC to CHANGE MANAGEMENT
Training Plan for Service Desk Agent
1. Health & Safety
2. Data Protection
3. Customer Service Skills
4. Business Awareness
5. IT skills in supported applications
6. How to use the Service Desk tools (e.g. Clarify)
7. Service Desk procedures
8. SLAs being supported
9. Baseline fixes (e.g. passwords)
10. Contacts & hand-offs to IT Support Manager & Suppliers
11. Overview of ITIL’s view of IT Support Managers
Service Desk Activities
Single Point of Contact
Log all Incidents
Resolve Incidents using KE DB
Escalation, Service Requirements, Reporting, Trends, Workarounds, Monitor, Track, Information Requests, Categorisation, Prioritisation, Closure
Refer to Second Line
First Investigation, First Diagnosis, Recovery
Keep User Informed
The Service Desk Personality
Patience, Communicative, Confidence, Enthusiastic, Friendly, Empathetic, Assertive, Literate, Numerate, Honest, Forthright
Condescending, Aggressive, Technical Specialist
Classification
Categorisation of an Incident
E.g. Hardware, Software, Documentation, User Error
Prioritisation of that Incident
Influenced by
EFFECT – What will it be on the business and number of users?
SPEED – How quickly is a fix needed?
Priority matrix (table): High/Low EFFECT against High/Low SPEED, giving Priority 1, 2 or 3
Known Error
An error is only a KNOWN ERROR when…
Root Cause & Temporary Fix OR Permanent Fix
Problem Activities
Reactive: Problem Control, Error Control, Major Incident Support
Proactive: Management Information, Major Problem Review
Change Advisory Board (CAB) Members
Business and IT
Chair: Change Manager
IT Representatives: Service Desk Manager, Problem Manager, Application Manager, Operations Manager, Security Manager
Business Representatives: Senior Users, Finance Representatives
As Appropriate
Outputs from CAB Meetings
CAB Minutes
Forward Schedule of Change (FSC) – Gantt table (3-6 months out)
Projected Service Availability (PSA) – Gantt table (3-6 months out)
Category
Category | Notes | Assessor/Approver
MINOR | Affects few Users; affects single service; low or no cost; little implementation resource; low risk | Change Manager
SIGNIFICANT | Between MINOR and MAJOR | CAB
MAJOR | Affects many Users; affects several services; substantial cost; requires implementation teams; high risk; usually a project | Executive Management (Board)
Configuration Activities
1. Planning
2. Identification of Configuration Item (CI)
3. Control
4. Status Accounting
5. Verification & Audit
IT Infrastructure
Hardware, Software, Documentation, IT Staff, Process & Procedure, Technical Documentation, Diagrams, Organisational Charts
Configuration Attributes
An example could be a PC:
Hardware configuration: Processor Type/Speed, Memory, Disks
Serial Number / Model Number / Asset Number
Operating System
Location / User
Network: IP, MAC, Hub Port #
Date of Purchase / Warranty Period
Supplier / Support Contact Details
Type of Support
Service History: Incident #s, Problem #s, RFCs
Audit Trail
Business Unit as Owner
Purchase Price
CI Variant information – e.g. the keyboard is French
ABC of Finance
Cost Types (THE SPA): Transfer, Hardware, External, Software, People, Accommodation
Mandatory, Optional
Cost Elements (Hardware): Main Frame, Server, Desk Top, Network
Cost Elements (Software): Operating System, Applications, Utility
Recovery
>72 Hours – Empty Computer Space; Remote Centre (External) / Portable
24-72 Hours – Filled Computer Space, No Data; Remote Centre (External) / Portable
0-8 Hours – Filled Computer Space with Data; Remote Centre (External)
Methodology for Managing Risk
CRAMM – CCTA Risk Analysis Management & Methodology
Demand
Examples: Call at Service Desk, CPU Utilisation, Network Bandwidth Utilisation, Connects to Server
Chart: Frequency of demand against time of day (0700, 1000, 1300, 1700)
An alert will trigger when the demand is about to exceed the capacity
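The alert rule above can be sketched in a few lines of Python. This is only an illustration: the function name `capacity_alerts`, the 90 % warning threshold and the call volumes are assumptions, not figures from the slides.

```python
def capacity_alerts(demand_by_time, capacity, threshold=0.9):
    """Return the times at which demand is about to exceed capacity.

    'About to exceed' is modelled (as an assumption) as demand reaching
    a fraction `threshold` of the available capacity.
    """
    limit = capacity * threshold
    return [time for time, demand in demand_by_time.items() if demand >= limit]


# Hypothetical number of calls at the Service Desk per time slot
calls = {"0700": 20, "1000": 95, "1300": 60, "1700": 88}

print(capacity_alerts(calls, capacity=100))
```

Lowering `threshold` gives earlier warnings at the cost of more alerts.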
Component Failure Impact Analysis (CFIA)
Each component can be assessed to see how many services depend on it. From here, each component can be weighted to highlight the high impact components. E.g. If #3 breaks down, this will have a high impact on many services
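A minimal sketch of the weighting described above, in Python. The dependency map is hypothetical (it is not the data from the slide's table); it simply counts how many services depend on each component and ranks the components by that count.

```python
# Hypothetical dependency map: service -> components (CIs) it relies on
depends_on = {
    "A": {"#1", "#3"},
    "B": {"#2", "#3"},
    "C": {"#3"},
}


def component_impact(depends_on):
    """Count, for every component, the number of services that depend on it."""
    impact = {}
    for components in depends_on.values():
        for ci in components:
            impact[ci] = impact.get(ci, 0) + 1
    return impact


impact = component_impact(depends_on)
# Highest-impact components first; ties are broken by component name
ranked = sorted(impact, key=lambda ci: (-impact[ci], ci))
print(ranked)
```

In this made-up data, component #3 supports every service, so it comes out as the high-impact component.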
CFIA matrix (table): Components (CIs) #1, #2, #3 to #n against Services A, B, C, XX, XY, with a TOTAL for each component and each service
Fault Tree Analysis
Fault Tree Analysis uses Boolean Logic (AND / OR), whereby the components can or cannot be used depending on the location of the fault.
Diagram: fault tree of SERVERs and ROUTERs, with nodes labelled A to R
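The AND / OR logic can be illustrated with a small Python sketch. The topology here (two redundant servers behind a single router) and all names are assumptions chosen for the example, not the tree shown on the slide.

```python
def and_gate(*inputs):
    """Output event occurs only if ALL input events occur (redundancy)."""
    return all(inputs)


def or_gate(*inputs):
    """Output event occurs if ANY input event occurs (single point of failure)."""
    return any(inputs)


def service_failed(failed):
    """Return True if the service is down, given a set of failed components.

    Hypothetical topology: SERVER A and SERVER B are redundant,
    and both sit behind one ROUTER.
    """
    servers_down = and_gate("SERVER A" in failed, "SERVER B" in failed)
    router_down = "ROUTER" in failed
    return or_gate(servers_down, router_down)


print(service_failed({"SERVER A"}))
print(service_failed({"ROUTER"}))
```

Losing one server leaves the service up, while losing the router takes it down, which is the kind of dependence on fault location the analysis is meant to expose.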
Considerations
Availability
Resilience
Reliability
Maintainability
Serviceability
Security – CIA: Confidentiality, Integrity, Availability
Managed through OLAs (Operational Level Agreements)
Resource Capacity Management
Managed through UCs (Underpinning Contracts)
Calculating Availability
% Availability = ((Agreed Service Time - Down Time) / Agreed Service Time) x 100 %
Example:
AST = 40 Hrs
DT = 2 Hrs
Therefore:
% Availability = ((40 - 2) / 40) x 100 % = (38 / 40) x 100 % = 95 %
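The same calculation as a short Python sketch, using the figures from the worked example (an Agreed Service Time of 40 hours with 2 hours of Down Time). The function name is an assumption made for illustration.

```python
def availability_percent(agreed_service_time, down_time):
    """% Availability = (AST - DT) / AST x 100."""
    if agreed_service_time <= 0:
        raise ValueError("Agreed Service Time must be positive")
    return (agreed_service_time - down_time) * 100 / agreed_service_time


print(availability_percent(40, 2))  # 95.0
```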
Summary of Responsibilities to Produce and Maintain
Note!
RISK is found in Change Management, Availability Management and IT Service Continuity Management.
Design for Availability
Manage Availability
Component Failure Impact Analysis (CFIA), Service Outage Analysis (SOA), Fault Tree Analysis (FTA), Technical Observation Post (TOP)
Quick Reference Key Words
SERVICE DESK
Function not a Process
Central Point of Contact – Increase user perception and satisfaction
Support for business goals
Log ALL Incidents
Provide First Line support
Second Line support – Generalists / Ops Support
Third Line support – Specialists
Produce measurement metrics
Categorise Incidents
INCIDENT MANAGEMENT
An Incident is an event that is not part of standard service
An Incident causes disruption
An Incident causes a reduction in service
An Incident has a ‘Life Cycle’
1. Detection and Recording
2. Classification and Support
3. Investigation and Diagnosis
4. Resolution and Recovery
5. Closure
Note! The Expanded Incident Life Cycle is part of AVAILABILITY MANAGEMENT and focuses on the Availability of the system: MTBSI, MTTR, MTBF.
IMPACT = EFFECT
URGENCY = SPEED
PRIORITY = Based on IMPACT and URGENCY
ESCALATION
1. Functional - Across support teams (Competence)
2. Hierarchical - Up Management line (Authority)
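The rule that PRIORITY is based on IMPACT and URGENCY can be sketched as a lookup table in Python. The specific mapping below is a hypothetical example; each organisation agrees its own priority codes.

```python
# Hypothetical mapping of (impact, urgency) to a priority code,
# where 1 is the highest priority. Not taken from the slides.
PRIORITY = {
    ("high", "high"): 1,
    ("high", "low"): 2,
    ("low", "high"): 2,
    ("low", "low"): 3,
}


def priority(impact, urgency):
    """Look up the priority for an incident from its impact and urgency."""
    return PRIORITY[(impact.lower(), urgency.lower())]


print(priority("High", "Low"))
```

Keeping the mapping in one table makes it easy to review and adjust alongside the SLA.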
PROBLEM MANAGEMENT
A Problem is the unknown underlying cause of one or more incidents.
A Known Error is when the root cause is known and a workaround has been found.
To find the root cause we need to undertake ROOT CAUSE ANALYSIS.
Two main elements of Problem Management
1. Error Control
2. Problem Control
Proactive as well as Reactive – Known Error Database
RFCs are required to resolve Known Errors
CONFIGURATION MANAGEMENT
All CIs are held in a CMDB; this is NOT the DSL
Configuration Items are RECORDS and as such are made up of ATTRIBUTES
There are always 2 Key Attributes
1. Unique Identifier
2. CI Type ID
There may also be a VARIANT attribute
CMDB is not an ASSET register because CMDB records RELATIONSHIPS between CIs (Parent-Child).
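As an illustration of why relationships matter, here is a minimal Python sketch of a CI record with the key attributes named above plus parent-child links. The class and field names are assumptions for the example, not a real CMDB schema.

```python
from dataclasses import dataclass, field


@dataclass
class ConfigurationItem:
    """A CI record: two key attributes, an optional variant, and child CIs."""
    unique_id: str
    ci_type: str
    variant: str = ""
    children: list = field(default_factory=list)

    def add_child(self, child):
        """Record a parent-child relationship and return the child."""
        self.children.append(child)
        return child

    def descendants(self):
        """Yield every CI below this one, depth first."""
        for child in self.children:
            yield child
            yield from child.descendants()


# Hypothetical example data
pc = ConfigurationItem("PC-001", "PC")
pc.add_child(ConfigurationItem("KB-001", "Keyboard", variant="French"))
pc.add_child(ConfigurationItem("MON-001", "Monitor"))

print([ci.unique_id for ci in pc.descendants()])
```

An asset register would hold the same three records but could not answer which CIs are affected when the parent PC changes.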
The Base Level is the lowest level at which CIs are UNIQUELY identified.
The Baseline is a SNAPSHOT in time of the CMDB.
Status Accounting is the reporting of all current and historical data of each CI, e.g. Life Cycle of a hardware component
1. Ordered
2. Delivered
3. Tested
4. Installed
5. Under Repair
6. Retired
CHANGE MANAGEMENT
Requests For Change – details of requested change
Forward Schedule of Changes – details of changes scheduled for implementation
Projected Service Availability – details of changes to agreed SLAs because of FSC
Change Category
1. MINOR – Change Manager approval
2. SIGNIFICANT – Change Advisory Board approval
3. MAJOR – Board approval
4. URGENT/EMERGENCY – CAB/Emergency Committee (EC) approval
Change Process
1. Register, Accept, Prioritise
2. Category, Authorise
3. Build, Test, Schedule (Note! Not necessarily FULL testing)
4. Implement, Back out
5. Review, Close
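The Change Category list above amounts to a routing table from category to approver. A minimal Python sketch follows; the function name and the handling of the EMERGENCY alias are assumptions made for illustration.

```python
# Approvers as listed under Change Category
APPROVER = {
    "MINOR": "Change Manager",
    "SIGNIFICANT": "Change Advisory Board",
    "MAJOR": "Board",
    "URGENT": "CAB/Emergency Committee (EC)",
}


def approver_for(category):
    """Return who approves a change of the given category."""
    key = category.strip().upper()
    if key == "EMERGENCY":  # treated here as the same category as URGENT
        key = "URGENT"
    return APPROVER[key]


print(approver_for("significant"))
```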
RELEASE MANAGEMENT
Definitive Software Library – Library where ALL authorised versions of software are stored and protected. A Physical library or storage repository where master copies of software versions are kept. This one logical store may consist of one or more physical software libraries or filestores.
Definitive Hardware Store – An area set aside for the secure storage of definitive hardware spares.
Release definitions
1. Release – a collection of authorised changes to an IT service
2. Release Unit – Portion of IT Infrastructure normally released together
3. Roll-out – deliver, install and commission an integrated set of new or changed CIs across logical or physical parts of the organisation.
Release Types
1. Delta – Partial release of CIs that have changed or are new since the last release.
2. Package – Individual releases (FULL units, DELTA or both) are grouped together to form a Package Release.
3. Full – All components of the release unit are built, tested, distributed and implemented together.
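The difference between a Delta and a Full release can be sketched in Python by comparing CI versions. Representing a release unit as a dictionary of CI name to version is an assumption made for this example, as are the CI names.

```python
def delta_release(previous, current):
    """CIs that are new or have changed since the last release."""
    return {ci: version for ci, version in current.items()
            if previous.get(ci) != version}


def full_release(current):
    """All components of the release unit, changed or not."""
    return dict(current)


# Hypothetical release unit before and after a set of changes
previous = {"app": "1.0", "db": "2.1"}
current = {"app": "1.1", "db": "2.1", "report": "1.0"}

print(delta_release(previous, current))
print(full_release(current))
```

A Package release would then simply be a group of such Full and/or Delta releases rolled out together.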
SERVICE LEVEL MANAGEMENT
Service Catalogue is a list of offerings
Service Level Requirements deals with AMOUNTS (response Metrics)
Service Level Agreement is a CLIENT / SUPPLIER Non Technical document, NOT a Contract
Operational Level Agreement is an Internal Technical or Non Technical document
Underpinning Contract is with 3rd Parties
Cost Focus
Deming’s Quality Circle – Plan, Do, Check, Act
Financial Management for IT Services
A COST MODEL must be defined and agreed BEFORE you can CHARGE
CHARGE Types
1. At Cost
2. Cost Plus
3. Going Rate
4. Market Rate
5. Fixed Price
An OVERHEAD is the TOTAL cost of INDIRECT materials e.g. Wages and Expenses
CAPITAL Costs apply to Fixed Substantial assets of the organisation e.g. Building
OPERATIONAL Costs result from Day to Day running of IT Services section e.g. Staff Costs
DIRECT Costs are those that can be traced in FULL to a product, service, cost centre or department e.g. direct wages of a member of Staff or Staff Type i.e. Contractors
INDIRECT costs cannot be traced directly and in full to a product, service or department. They may be spread across a number of departments, as an APPORTIONED cost or OVERHEAD.
Cost Types – THE SPA
Efficiency and Effectiveness are related to COST.
CAPACITY MANAGEMENT
Business Capacity Management – FUTURE business requirements
Service Capacity Management – CURRENT Service delivery
Resource Capacity Management – UNDERLYING resource components
Demand Management – Differential Charging
Modelling
1. TREND
2. ANALYTICAL
3. SIMULATION
4. BASELINE (What if?)
Application Sizing – Supply v Demand, Cost v Capacity
There are Capacity THRESHOLDS that when exceeded generate ALERTS
IT SERVICE CONTINUITY MANAGEMENT
Reduction in the VULNERABILITY of the IT service
RISK Analysis – Threats to assets (CRAMM)
Countermeasures – Recovery Options
1. Do Nothing
2. Manual Backup
3. Reciprocal Arrangements
4. COLD Gradual Recovery > 72 hrs
5. WARM Intermediate Recovery 24 to 72 hrs
6. HOT Immediate Recovery 0 to 8 hrs
Note! Cold, Warm and Hot recovery options should be at a REMOTE location
Increases customer confidence and can reduce Insurance premiums.
AVAILABILITY MANAGEMENT
ABILITY but NOT VULNERABILITY
Availability of service
Reliability, Maintainability – OLA Managed
Serviceability – SLA Managed
Security is Confidentiality, Integrity and Availability