ITIL
Alert
Monitoring tools trigger an alert, in the form of an email, when a metric reaches the threshold value that we set in the monitoring tool for our applications or servers.
Usually, we have two types of alerts:
Warning: 85%
Critical: 90%
(Monitored items include disk space, memory/swap memory, and configuration files.)
Eg:-
When the application is down or not reachable, the monitoring tool sends an alert.
When a critical alert (90%) is not taken care of immediately, usage can climb to 100% and lead to an incident (the server may hang, or processes may halt).
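The warning and critical thresholds described above can be sketched in Python as follows. This is only an illustration, not how any particular monitoring tool works: the names classify_usage and disk_alert are hypothetical, and the 85% and 90% values are simply the ones from these notes.

```python
import shutil

# Threshold values taken from the notes above (percent used).
WARNING_THRESHOLD = 85.0
CRITICAL_THRESHOLD = 90.0


def classify_usage(percent_used: float) -> str:
    """Map a usage percentage to an alert level (hypothetical helper)."""
    if percent_used >= CRITICAL_THRESHOLD:
        return "CRITICAL"
    if percent_used >= WARNING_THRESHOLD:
        return "WARNING"
    return "OK"


def disk_alert(path: str = "/") -> str:
    """Classify disk usage for the filesystem that contains `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    percent_used = usage.used / usage.total * 100
    return classify_usage(percent_used)
```

A real setup would send the email from the monitoring tool itself; the sketch only shows where the two thresholds sit relative to each other.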
Incident Management
If there is an interruption to the business, it can lead to an incident.
Levels of Severity
Sev1 => Critical => P1
Sev2 => High => P2
Sev3 => Medium => P3
Sev4 => Low => P4
Sev1:-
If there is a complete loss of business or the application is down, it is treated as Sev1.
Sev2:-
If there is no financial loss but some functionality is not working (Eg:- reports cannot be downloaded, or the search is not working), it is still an incident and is treated as Sev2.
Sev3:-
Usually, there is a minor impact to internal users.
Eg:- some buttons are not working, or images are not loading (rendering) properly.
It is treated as a medium incident.
Sev4:-
A low incident is raised whenever there is an alert that should be addressed as soon as possible.
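The severity rules above can be written down as a small decision function. This is a minimal sketch under assumptions: the Impact fields and the function names are hypothetical, and real organisations usually define severity in more detail than these notes do.

```python
from dataclasses import dataclass

# Severity number -> (label, priority), as listed in the notes.
SEVERITY_TABLE = {
    1: ("Critical", "P1"),
    2: ("High", "P2"),
    3: ("Medium", "P3"),
    4: ("Low", "P4"),
}


@dataclass
class Impact:
    """Hypothetical description of what an incident affects."""
    complete_business_loss: bool = False
    application_down: bool = False
    functionality_broken: bool = False   # e.g. reports or search not working
    minor_internal_impact: bool = False  # e.g. buttons or images misbehaving


def severity_for(impact: Impact) -> int:
    """Return the severity level, checking the most serious rule first."""
    if impact.complete_business_loss or impact.application_down:
        return 1
    if impact.functionality_broken:
        return 2
    if impact.minor_internal_impact:
        return 3
    return 4  # only an alert that should be addressed as soon as possible


def describe(severity: int) -> str:
    label, priority = SEVERITY_TABLE[severity]
    return f"Sev{severity} ({label}, {priority})"
```

For example, describe(severity_for(Impact(functionality_broken=True))) gives "Sev2 (High, P2)".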
Change Management
If any configuration changes are needed in an environment (DEV=>TEST=>STAGE=>PROD), we first make the changes in the non-production environments (a change request is not needed).
When we have to perform the change in the production environment, we have to submit a change request to the CAB meeting.
CAB = Change Advisory Board
For the change to be approved by the CAB, we have to prove that it was implemented successfully in the non-production environments (DEV=>TEST=>STAGE), and we should have a rollback plan ready before the CAB meeting starts.
The Change Manager will review and approve the change so that it can be implemented in a maintenance window.
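The checks described above (non-production proof, rollback plan, CAB approval) can be sketched as a pre-implementation gate. The ChangeRequest class and its field names are hypothetical and only illustrate the order of the checks; they do not correspond to any specific ticketing tool.

```python
from dataclasses import dataclass, field

NON_PROD = ("DEV", "TEST", "STAGE")


@dataclass
class ChangeRequest:
    """Hypothetical record of a change moving towards production."""
    summary: str
    target_env: str
    verified_in: set = field(default_factory=set)  # environments already tested
    rollback_plan: str = ""
    cab_approved: bool = False


def needs_change_request(env: str) -> bool:
    # Per the notes, only production changes need a change request.
    return env == "PROD"


def blockers(cr: ChangeRequest) -> list:
    """List what still prevents the change from being implemented."""
    if not needs_change_request(cr.target_env):
        return []
    problems = []
    missing = [env for env in NON_PROD if env not in cr.verified_in]
    if missing:
        problems.append("not verified in: " + ", ".join(missing))
    if not cr.rollback_plan.strip():
        problems.append("no rollback plan")
    if not cr.cab_approved:
        problems.append("not approved by CAB")
    return problems
```

An empty list means the change is ready to be scheduled in a maintenance window; anything else is what has to be fixed first.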
Service Request
Usually, we receive service requests (REQs) from other teams (configuration changes, deployments, load tests, etc.).
Sometimes, we raise service requests to other infra teams.
UNIX Team
To create server accounts, to plumb the listen addresses, or to install any software that needs root permission (third-party tools).
To clear disk space under /var, or to get sudo access or any other access on the network.
DB Team
We raise a request to the DB team in case we need data source passwords. If an account is locked, we request the DB team to unlock it.
NOTE:-
Before they unlock the service account, we should make sure we have not entered a wrong password for that user anywhere.
N/W Team
To open a firewall port, to service any IP address/port, or to create a DNS entry.
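The routing of service requests to the UNIX, DB and N/W teams can be summarised as a simple lookup. This is a sketch only: the request-type strings and the function name team_for are hypothetical labels chosen for illustration, based on the examples listed above.

```python
# Request type -> owning team, following the examples in the notes.
ROUTING = {
    "create server account": "UNIX",
    "install software as root": "UNIX",
    "clear disk space under /var": "UNIX",
    "grant sudo access": "UNIX",
    "data source password": "DB",
    "unlock account": "DB",
    "open firewall port": "N/W",
    "create dns entry": "N/W",
}


def team_for(request_type: str) -> str:
    """Return the team that handles a request type (case-insensitive)."""
    key = request_type.strip().lower()
    if key not in ROUTING:
        raise KeyError(f"no team is mapped for request type: {request_type!r}")
    return ROUTING[key]
```

Raising an error for unknown request types is a deliberate choice here, so that an unmapped request is noticed instead of being sent to the wrong team.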