Monday, May 12, 2025

Prometheus - Write a Query through PromQL - Part 2


PromQL:
PromQL expressions are classified into the following types:
Instant Vector
Range Vector
Scalar
String
Instant and range vectors share a basic syntax consisting of a metric name along with label matchers. Under the hood, a metric name in Prometheus is simply the value of a label named __name__.
example:
{__name__="prometheus_build_info"}
This is equivalent to querying directly by the metric name: prometheus_build_info

Label matchers are defined using a label key, a label matching operator, and the label value. Some possible label-matching operators:
=: The label value matches the specified string
!=: The label value does not match the specified string
=~: The label value matches the regex in the specified string
!~: The label value does not match the regex in the specified string
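Matchers can be combined in a single selector; as an illustrative sketch (the label values matched here are made up):
Example:
prometheus_build_info{instance!="", version=~"2.*"}
This selects every prometheus_build_info series whose instance label is non-empty and whose version label starts with 2.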

An instant vector selector returns data at a single instant, using the most recent available sample for each time series you select.
A range vector selector retrieves data over a given time range. The range is specified as a duration followed by a unit.
The following are valid units for a duration:

ms: Milliseconds
s: Seconds
m: Minutes
h: Hours
d: Days (assumes a day is always 24 hours)
w: Weeks (assumes a week is always 7 days)
y: Years (assumes a year is always 365 days)

Range vector selectors are most useful when aggregating data for analysis, for example to derive the number of requests per second a service experienced over a particular time range.
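For example, to compute the per-second request rate over the last five minutes (http_requests_total is an illustrative metric name, not one guaranteed to exist in your setup):
Example:
rate(http_requests_total[5m])
The [5m] range selector feeds five minutes of samples into rate(), which returns an instant vector of per-second rates.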
Prometheus has two primary API endpoints used to retrieve time series data: /api/v1/query and /api/v1/query_range.
Note: Both instant vectors and range vectors are valid for the /query endpoint, but only instant vectors are valid for the /query_range endpoint.
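A minimal sketch of calling both endpoints with curl, assuming a Prometheus server listening on localhost:9090:
Example:
curl 'http://localhost:9090/api/v1/query?query=up'
curl 'http://localhost:9090/api/v1/query_range?query=up&start=2025-05-12T00:00:00Z&end=2025-05-12T01:00:00Z&step=60s'
The /query call evaluates the expression at a single instant, while /query_range evaluates it repeatedly at every step across the start/end window.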
Offsets:
The offset modifier lets us pull time series data from the past, shifted back by the specified duration.
Example:
my_cool_custom_metric offset 5m
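Offsets are handy for comparing current behavior with the past; a sketch reusing the illustrative metric above:
Example:
rate(my_cool_custom_metric[5m]) / rate(my_cool_custom_metric[5m] offset 1h)
This expresses the current five-minute rate as a ratio of the rate one hour ago.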
Group Modifier and Vector Matching:
Prometheus provides ways to join the labels of different metrics together using the group_left and group_right query modifiers. These modifiers allow many-to-one and one-to-many matching of vectors.
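As a hypothetical sketch of many-to-one matching (both metric names here are invented for illustration):
Example:
app_errors_total * on(instance) group_left(version) app_build_info
Each app_errors_total series is matched to the single app_build_info series sharing its instance label, and the version label from the right-hand side is copied onto the result.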
Logical and set binary operators:
PromQL supports the and, or, and unless set operators.
Example:
vector1 and vector2
It requires that for every series in vector1, there is an exactly matching label set in vector2 (ignoring the __name__ label). If there is no match for a series' label set from vector1 inside vector2, that series is not included in the results.
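A concrete sketch using the built-in up metric together with a hypothetical maintenance_mode metric:
Example:
up == 0 unless on(instance) maintenance_mode
This returns down targets, except those whose instance also has a maintenance_mode series, which is a common way to suppress expected alerts.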

Sunday, May 4, 2025

Prometheus - Time Series Database - Part 1



Prometheus Installation:

We can install Prometheus in several ways:
  • Install from source
  • Run as a Docker container
  • Install through a script
  • Install through a package manager
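As a quick sketch, the official Docker image can be run as follows (the port mapping and container name are our choices):
Example:
docker run -d -p 9090:9090 --name prometheus prom/prometheus
This starts Prometheus with its default configuration and exposes the web UI/API on port 9090.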
Prometheus metrics come from exporter processes such as node_exporter running on remote hosts. Each metric is associated with one or more time series.
Events are point-in-time records of actions occurring at that moment.
Prometheus uses a pull-based model to extract data from the systems it monitors, whereas Graphite and Nagios use a push-based model in which the monitored systems push metrics to a remote server.
Components of the Prometheus Stack:
  • Alertmanager: routes alerts coming from Prometheus.
  • Exporters: multiple exporter components exist depending on the target device, such as Node Exporter.
  • Grafana: used to visualize metrics from Prometheus through dashboards.
Prometheus itself has four main components: the time series database [TSDB], the scrape manager, the rule manager, and the Web UI.
Time Series Database:
   It is a special kind of database optimized for storing data points in time series order. The core parts of the TSDB are the head block, the write-ahead log [WAL], and the on-disk data format (blocks, chunks, and indexes). The head block is the entry point for samples being scraped and stored in the TSDB.
Samples are appended to a chunk in the head block, and the chunk stays in memory until it hits its sample limit. Each sample is written to the WAL before it is added to memory, so the data is preserved if the server panics or reboots unexpectedly.
To increase resource efficiency, some wizardry is done to the chunk data so that it only stores the direct value of the first timestamp (t0) and first value (v0). Subsequent timestamps are set to the delta of the prior timestamp, so starting at t2, all timestamps are the delta of a delta. Similarly, subsequent values are compared to their prior value using a bitwise XOR operator. This just means that the difference between the samples is stored, and if they’re the same, then 0 is stored.
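A small worked example of the timestamp side, using made-up numbers: samples at t0=1000, t1=1010, and t2=1020 give a first delta of 10 (t1-t0) and a delta-of-delta of 0 at t2 ((t2-t1)-(t1-t0)), so a perfectly regular scrape interval compresses into a stream of zeros.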
Scrape Manager:
  The scrape manager is the part of Prometheus that handles pulling metrics from applications and exporters, performing the scrapes and maintaining an internal list of what should be scraped by Prometheus.
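A minimal sketch of what drives the scrape manager, in prometheus.yml (the job name and target address are placeholders):
Example:
scrape_configs:
  - job_name: 'node'
    scrape_interval: 15s
    static_configs:
      - targets: ['localhost:9100']
Each entry defines a job and the targets the scrape manager will pull metrics from on the given interval.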
Rule Manager:
  The rule manager is the part of Prometheus that handles evaluating alerting and recording rules. It evaluates rule groups, composed of alerting rules and recording rules, at regular intervals.
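A sketch of a rule group file (the alert name and expression are illustrative):
Example:
groups:
  - name: example
    interval: 1m
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 5m
The rule manager evaluates expr every interval and fires the alert once it has been true for the for duration.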
Web UI/API:
  The Web UI/API is the portion of Prometheus that you access via your browser. It also exposes a powerful REST API, which provides integration with other graphical tools such as Grafana.
Alert Manager:
  Alertmanager is responsible for sending alerts to Slack, PagerDuty, and other alerting destinations. It handles all of this through a routing-tree-based workflow.
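A sketch of that routing tree in alertmanager.yml (the receiver names are placeholders):
Example:
route:
  receiver: 'default-team'
  routes:
    - match:
        severity: critical
      receiver: 'pagerduty-oncall'
Alerts flow down the tree to the first matching route; non-critical alerts fall through to the default receiver.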





Wednesday, April 23, 2025

Terraform [HCL] Language - Write a terraform code


HashiCorp Configuration Language [HCL]:


Expressions:
  * Expressions work with values in the configuration
  * They can be simple values such as text or numbers
  * They can also be complex, such as data references, loops, and conditions
   Example:
 list(tuple) - ["us-east-1", "us-east-2"]
 map - { name = "user1", department = "devops"}
 bool - true or false
 
version = "~> 4.16"
~> pins the major version: it will allow minor versions up to 4.99 but will not change the major number, which in our case is 4.

"*" operation:
It will allow the number in the loop and avoid overloading a variable into memory.
Example:
output "ebs_block_device" {
  description = "block device volume IDs"
  value = aws_instance.splat_lab_labs.ebs_block_device[*].volume_id
}
Functions:
  * A function is one or more instructions that perform a specific task.
  * Terraform functions are used to add functionality or to transform and combine values.
  Example:
 resource "aws_iam_user" "web_user" {
   name ="user-${count.index}"
   count = 5
   tags = {
     time_created = timestamp()
department = "OPS"
   }
}
Example2:
resource "aws_iam_user" "functional_user" {
  name = "functional-user"
  tags = {
    department = "OPS"
time_created = timestamp()
time2= formatdate("MM DD YYYY hh:mm ZZZ", timestamp()}
  }
}

Meta Arguments:
count - It allows creating multiple instances of a resource from a single block.
Example:
resource "aws_instance" "count_test" {
  count = 2
  ami = "ami-0c7c4e3c6b4941f0f"
  instance_type = "t2.micro"
  tags = {
    Name ="Count-Test-${count.index}"
  }
}
for_each meta argument:
for_each iterates over a set or map, creating one instance of the resource for each element.
Example:
Creating four users while creating an AWS instance.

resource "aws_iam_user" "Accounts" {
  for_each =toset{("Shiva", "Dev", "John", "Abdul")}
  name = each.key
  }
}
Local Values:
  * Local value assigns a name to an expression that can be reused easily.
  * Use cases include shared lists [ports, usernames] and references to other values.
Example:
resource "aws_iam_user" "accounts" {
  for_each=local.accounts
  name = each.key
}
we will define a local function within the block.
locals {
  accounts = toset {("James", "Don")}
}

Dynamic block:
A dynamic block generates repeated nested blocks within a resource block, making the configuration reusable and reducing duplication.
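A sketch of a dynamic block generating security group ingress rules (the resource name and port list are illustrative):
Example:
resource "aws_security_group" "web" {
  name = "web-sg"
  dynamic "ingress" {
    for_each = [80, 443]
    content {
      from_port   = ingress.value
      to_port     = ingress.value
      protocol    = "tcp"
      cidr_blocks = ["0.0.0.0/0"]
    }
  }
}
Each element of the for_each list produces one ingress block.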

Version Constraints:
  * Version constraints are configurable strings that manage the version of software used with Terraform, including providers and the Terraform version itself.
  * The Terraform version follows semantic versioning (Major.Minor.Patch)

= constraint - allows the exact version only
!= constraint - excludes the exact version number
< > - greater than, less than a version number
>= <= - greater than or equal to, less than or equal to that version
~> - only the rightmost number increments [minor or patch number]

Store the state file in a remote object store through Terraform code.
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "5.0.1"
    }
  }
  required_version = "<= 1.4.6"
}

module "s3_bucket" {
  source= "terraform-aws-modules/s3-bucket/aws"
  version "3.14.0"
  bucket =""
  acl = "private"
  force_destroy = true
  
  control_object_ownership = true
  object_ownership = "ObjectWriter"
  
  versioning = {
    enabled =true
  }
|

The Terraform state file is saved in the bucket, which is configured as the backend.
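A sketch of such a backend configuration (the bucket name, key, and region are placeholders to replace with your own):
Example:
terraform {
  backend "s3" {
    bucket = "my-tf-state-bucket"
    key    = "project/terraform.tfstate"
    region = "us-east-1"
  }
}
After adding this, terraform init migrates the state into the bucket so the whole team shares one state file.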

Life cycle management:
create_before_destroy : creates the replacement resource before destroying the old one
prevent_destroy : prevents the resource from being destroyed
ignore_changes : ignores changes to the listed attributes so they do not trigger an update
replace_triggered_by : replaces the resource when the referenced resource or attribute changes
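A sketch of a lifecycle block inside a resource (the attribute choices are illustrative):
Example:
resource "aws_instance" "app" {
  ami           = "ami-0c7c4e3c6b4941f0f"
  instance_type = "t2.micro"
  lifecycle {
    create_before_destroy = true
    ignore_changes        = [tags]
  }
}
Here the replacement instance is created before the old one is destroyed, and tag edits made outside Terraform are left alone.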

Saturday, April 12, 2025

Use case study of CloudFormation and Terraform


Scope: CloudFormation is very powerful because it is developed and supported directly by AWS, but Terraform has a great community that always works at a fast pace to ensure new resources, and features are implemented for providers quickly.
Type: CloudFormation is a managed service by AWS, but Terraform has a CLI tool that can run from your workstation, a server, or a CI/CD system (such as Jenkins, GitHub Actions, etc.) or Terraform Cloud (a SaaS automation solution from HashiCorp).
License and support: CloudFormation is a native AWS service, and AWS Support plans cover it as well. Terraform is an enterprise product and an open source project. HashiCorp offers 24/7 support, but at the same time, the huge Terraform community and provider developers are always helpful.
Syntax/language: CloudFormation supports both JSON and YAML formats. Terraform uses HashiCorp Configuration Language (HCL), which is human-readable as well as machine-friendly.
Architecture: CloudFormation is an AWS-managed service to which you send/upload your templates for provisioning; on the other hand, Terraform is a decentralized system with which you can provision infrastructure from any workstation or server.
Modularization: In CloudFormation, nested stacks and cross-stack references can be used to achieve modularization, while Terraform is capable of creating reusable and reproducible modules.
User experience/ease of use: In contrast to CloudFormation, which is limited to AWS services, Terraform spans multiple cloud service providers such as AWS, Azure, and Google Cloud Platform, among others. This flexibility allows Terraform to provide a unified approach to managing cloud infrastructure across multiple providers, making it a popular choice for organizations that use more than one cloud provider.
Life cycle and state management: CloudFormation stores the state and manages it with the use of stacks. Terraform stores the state on disk in JSON format and allows you to use a remote state system, such as an AWS S3 bucket, that gives you the capability of tracking versions.
Import from existing infrastructure: It is possible to import resources into CloudFormation, but only a few resources are supported. It is possible to import all resources into Terraform state, but it does not generate configuration in the process; you need to handle that. But there are third-party tools that can generate configuration, too.
Verification steps: CloudFormation uses change sets to verify the required changes. Terraform has a powerful plan for identifying changes and allows you to verify your changes to existing infrastructure before applying them.
Rolling updates and rollbacks: CloudFormation automatically rolls back to the last working state. Terraform has no feature for rolling updates or rollbacks, but you can build a rollback system using a CI/CD system.
Multi-cloud management: CloudFormation is AWS-only, but Terraform supports multiple cloud providers and many more services.
Compliance integration: CloudFormation is built by AWS, so compliance is already assured, but for Terraform, you need to implement third-party tools yourself to achieve compliance.
Deployment type: CloudFormation has a built-in CI/CD system that takes care of everything concerning deployment and rollbacks. Terraform can be deployed from any system, but you need to build your CI/CD workflow or adopt a service that can fill the gaps.
Drift detection: Both tools have drift detection by default.
Cost: Using AWS CloudFormation does not incur any additional charges beyond the cost of the AWS resources that are created, such as Amazon EC2 instances or Elastic Load Balancing load balancers. In contrast, Terraform is an open source project that can be used free of charge. However, to obtain enterprise-level features such as CI/CD automation and state management, you may need to consider using additional services and systems provided by HashiCorp or third-party service providers. These additional services may come with their own costs.

Terraform - Part 2


Terraform Workflow:
Terraform workflows consist of five fundamental steps:


Write - Create a module of your code.
Init - Initialize your working directory and download the required provider plugins.
Plan - Review and predict the changes and decide whether to accept them.
Apply - Implement the changes in the real environment.
Destroy - Tear down the infrastructure we created.




We can validate the file formatting through the terraform fmt command, which also surfaces syntax errors:

[root@thiru project]# terraform fmt main.tf
│ Error: Invalid multi-line string
│   on main.tf line 15:
│   15: resource "aws_instance" "Web_server {
│   16:   ami =
│ Quoted strings may not be split over multiple lines. To produce a multi-line string, either use the \n escape to represent a newline character or use the
│ "heredoc" multi-line template syntax.

[root@thiru project]# terraform fmt main.tf
[root@thiru project]#


Friday, April 4, 2025

Ansible - Ansible Tower


Ansible Tower:
Ansible Tower is a web-based platform that makes working with Ansible easier in large-scale environments. Ansible Tower has been renamed to Ansible Automation Platform in the latest version.
The Ansible Automation Platform is classified as below:
  • Event-Driven Ansible controller - triggers playbooks in reaction to specific events.
  • Ansible Automation Hub - an integrated platform to manage Ansible Content Collections.
  • Ansible Lightspeed AI [requires a separate subscription]

* Installation is controlled by an inventory file. Inventory files define the hosts and containers to be created and the variables that apply to them.
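A sketch of what such an installer inventory can look like (the host names and passwords are placeholders, and the exact group and variable names differ between AWX and Automation Platform versions):
Example:
[automationcontroller]
tower.example.com

[database]
db.example.com

[all:vars]
admin_password='changeme'
pg_password='changeme'
The installer reads this file to decide which hosts receive which components and with which settings.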
Managing machines with Tower
Managing machines with Tower is similar to managing machines with Ansible from the command line.
Identify the managed machines from Tower
* Set up /etc/hosts to resolve the DNS names of the managed machines.
We need to ensure that the setup below is in place on the managed machines:
* Ensure sshd is running and that the firewall accepts incoming connections
* A user account with sudo privileges is needed
* Passwordless SSH connections between Tower and the managed servers need to be enabled
Ansible Tower components:
* Organization - a collection of managed devices
* Users - administrative users that can be granted access to specific tasks
* Inventories - managed servers; they can be created statically or dynamically
* Credentials - credentials used to log in to managed machines (like AWS or cloud credentials)
* Project - a collection of playbooks obtained from a certain location (e.g. Git)
* Template - the job definition with all of its parameters; it can be launched or scheduled
Setup the project in AWX:
We need to follow the steps below to create our first project under AWX.
1) Create an organization
2) Create an inventory
3) Configure credentials
4) Set up the project
5) Define a job template
6) Run the job
Create an Inventory:
Log in to the AWX console and navigate to Inventories under Resources.


Create a host entry under the inventory:


Create credentials:
Navigate to Credentials under Resources in the AWX GUI.

Click on Create Credential and define the username and password that will be used to manage the resources and templates.
Create a Project:
Navigate to Projects under Resources.
Define a project name and select the AWX environment and Git repo.

Create a workflow template for more than one job.


Submit the job and monitor it. We can also schedule a job for a particular time.





Wednesday, April 2, 2025

GCP - VPC - part 2


VPC - Virtual Private Cloud

VPC networks in GCP are classified into two types.

Auto Mode : This is the default VPC in GCP. The network is configured automatically and the firewall rules come pre-configured as well. We should not use this mode in a production environment.

Custom Mode : The IP allocation and firewall setup need to be handled by us. It is the safe and secure setup for a production environment.
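A sketch of creating a custom-mode network and one subnet with the gcloud CLI (the network name, region, and IP range are placeholders):
Example:
gcloud compute networks create my-custom-vpc --subnet-mode=custom
gcloud compute networks subnets create subnet-a --network=my-custom-vpc --region=us-east1 --range=10.0.1.0/24
With custom mode, no subnets exist until we create them, so the IP plan stays fully under our control.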

Subnets play a vital role in the VPC network.



Within the project, subnets A and B can communicate across regions through the internal network. Subnets C and D need to communicate through the external network even though both belong to the same region.



Firewall Rule Configuration:
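A sketch of creating an ingress firewall rule with gcloud (the rule and network names are placeholders):
Example:
gcloud compute firewall-rules create allow-ssh --network=my-custom-vpc --direction=INGRESS --allow=tcp:22 --source-ranges=0.0.0.0/0
This permits SSH from anywhere into instances on the network; in production the source range should be narrowed.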


Load Balancing:


Application Load Balancer:

Proxy Load Balancer: