Monday, April 6, 2026

Optimizing AI models for Production

April 06, 2026

 



We typically use LLMs in three ways:

1. Encode text into semantic vectors with little or no fine-tuning
2. Fine-tune a pre-trained LLM to perform a very specific task using transfer learning
3. Query an LLM to solve a task it was pre-trained on or can intuit.
There are two main types of LLMs:
1) Auto-encoding LLMs - learn an entire sequence by predicting tokens (words) given both past and future context. Best for classification and embedding + retrieval tasks. [Example: BERT]
2) Auto-regressive LLMs - predict the next (future) token given only the past context. [Example: GPT]
LLMs excel at tasks that require reasoning over the context and the input information in conjunction to produce a nuanced answer.
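A toy sketch of the difference between the two LLM types (plain Python, not a real model): the two training objectives differ only in which context is visible when predicting a token:

```python
# Toy illustration (not real BERT/GPT): the two objectives differ only
# in what context the model may see when predicting a token.

tokens = ["the", "cat", "sat", "on", "the", "mat"]

# Auto-encoding (BERT-style): predict a masked token from BOTH sides.
masked_pos = 2  # hide "sat"
bidirectional_context = tokens[:masked_pos] + ["[MASK]"] + tokens[masked_pos + 1:]

# Auto-regressive (GPT-style): predict the NEXT token from the past only.
causal_context = tokens[:masked_pos]

print(bidirectional_context)  # ['the', 'cat', '[MASK]', 'on', 'the', 'mat']
print(causal_context)         # ['the', 'cat']
```

Because the auto-encoder sees both sides of the gap, it learns representations that suit classification and retrieval; the auto-regressor's past-only view is what makes it a text generator.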


AI agents are semi-autonomous systems that interact with their environment, make decisions, and perform tasks on behalf of users.
Autonomy - they can perform tasks without continuous human intervention.
Decision Making - they use data to analyze situations and choose actions.
Adaptability - Learn and improve over time with feedback.
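A minimal sketch of that loop (all names and the temperature rule are illustrative, not from any real framework): the agent observes, decides from data, acts, and records feedback it could later learn from:

```python
# Minimal agent loop sketch: observe -> decide -> act, with feedback kept
# for adaptation. Everything here (sensor, rule, reward) is illustrative.
def run_agent(environment, steps=3):
    history = []                                        # "adaptability": remembered feedback
    for _ in range(steps):
        observation = environment["sensor"]()           # observe the environment
        action = "cool" if observation > 25 else "idle" # decide from data
        reward = environment["act"](action)             # act autonomously
        history.append((observation, action, reward))   # feedback to learn from
    return history

temps = iter([30, 26, 22])
env = {"sensor": lambda: next(temps),
       "act": lambda a: 1 if a == "cool" else 0}
print(run_agent(env))  # [(30, 'cool', 1), (26, 'cool', 1), (22, 'idle', 0)]
```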

Saturday, April 4, 2026

Deep Dive into Kubernetes Networking

April 04, 2026

 

K8s networking is dynamic: Pods are ephemeral, and their IPs change on every restart.

Containers within a pod share a single network namespace.

K8s Networking Model:

1) Every Pod receives a unique, cluster-wide IP address.

2) All pods on the same node can communicate directly without NAT.

3) All pods on different nodes can communicate directly without NAT.

4) A Pod's self-seen IP is identical to the IP other pods use to reach it [flat network].

Kubernetes specifies what is required; CNI plugins decide how to implement it.

Communication patterns in K8s

Container to Container - within the same pod via loopback [127.0.0.1]

Pod to Pod - Direct IP communication across nodes without address translation

Pod to Service - kube-proxy intercepts traffic and load-balances to healthy endpoints

External to Service - exposed via NodePort, a LoadBalancer-type Service, or an Ingress controller

Node to Pod - the kubelet and monitoring agents


Kube-Proxy:

Kube-proxy runs on every node as a DaemonSet and is part of the Kubernetes control plane. It watches the API server for any change to Service and Endpoints resources; the API server creates an Endpoints object when a Service's selector matches pods. In iptables mode, kube-proxy maintains chains of iptables rules used for local and forward routing. IPVS mode uses a kernel-level virtual load balancer that can handle thousands of service requests and route them at the same time.

Kube-proxy takes care of pod-to-service traffic, while the CNI plugin takes care of pod-to-pod communication.
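Conceptually, kube-proxy's job on the service path boils down to "pick a healthy endpoint". A toy round-robin sketch (hypothetical IPs, mirroring IPVS's `rr` scheduler):

```python
from itertools import cycle

# Hypothetical endpoint list for a Service; 'ready' mirrors the readiness
# recorded in the Endpoints object - kube-proxy only forwards to ready pods.
endpoints = [
    {"ip": "10.244.1.5", "ready": True},
    {"ip": "10.244.2.7", "ready": False},  # failed its readiness probe
    {"ip": "10.244.3.9", "ready": True},
]

healthy = cycle([e["ip"] for e in endpoints if e["ready"]])

# Round-robin across healthy endpoints, like IPVS 'rr' scheduling.
picks = [next(healthy) for _ in range(4)]
print(picks)  # ['10.244.1.5', '10.244.3.9', '10.244.1.5', '10.244.3.9']
```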

CoreDNS:

CoreDNS is the cluster DNS server, deployed as a Deployment in the kube-system namespace.

Every pod's /etc/resolv.conf is injected to point to CoreDNS.
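The names CoreDNS answers follow the standard Kubernetes DNS scheme `<service>.<namespace>.svc.<cluster-domain>`; a one-line helper shows the shape (the default cluster domain `cluster.local` is assumed):

```python
def service_fqdn(service, namespace, cluster_domain="cluster.local"):
    """Build the DNS name CoreDNS answers for a Kubernetes Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"

print(service_fqdn("kubernetes", "default"))
# kubernetes.default.svc.cluster.local
```

Within the same namespace a pod can use just the short name (`kubernetes`), because resolv.conf's search domains fill in the rest.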

Pod Networking:

Each pod has its own network namespace and a fully isolated network stack. The namespace contains virtual NICs, a routing table, and iptables rules.

The infra [pause] container creates and owns the network namespace for the pod. All application containers in the Pod join the infra container's namespace at startup.

Virtual ethernet [veth] pair: two virtual NICs connecting the Pod side and the Node side. One end lives inside the Pod's network namespace [eth0], and the other end is attached to the Node, e.g. to a Linux bridge [cbr0].

Traffic flow: Pod [eth0] -> veth pair -> host bridge -> node routing table -> destination

Cross Node communication:

Node-to-node communication uses either an overlay approach or an underlay approach.

The overlay approach encapsulates traffic on the source node and decapsulates it on the destination node.

The underlay approach uses direct routing.

Modern CNIs like Calico and Cilium support both approaches.

Overlay (VXLAN/Geneve) - universal compatibility and cloud friendly. Encapsulation adds up to 50 bytes of overhead per packet, which is why the pod MTU is often set to 1450.

Underlay - requires the physical network to accept and route pod IPs, typically via BGP routing.
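The 50-byte figure is simple arithmetic: VXLAN wraps each pod packet in an inner Ethernet header plus VXLAN, UDP, and outer IPv4 headers, which is why a 1500-byte physical MTU leaves 1450 bytes for the pod interface:

```python
PHYSICAL_MTU = 1500          # typical Ethernet MTU on the node NIC

# VXLAN encapsulation overhead per packet (IPv4 outer header):
#   outer IP (20) + UDP (8) + VXLAN header (8) + inner Ethernet (14)
overhead = 20 + 8 + 8 + 14
pod_mtu = PHYSICAL_MTU - overhead

print(overhead, pod_mtu)  # 50 1450
```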

Sunday, March 22, 2026

LLM Context Management with Cursor 2.0

March 22, 2026

 

Cursor 2.0 is an AI editor aimed at production environments. It can run 8 parallel agents without any issue.

Context management is like keeping a story on track when it gets convoluted: it gives the AI a clear path back when it gets confused.

The context window is the span of the conversation, both user and AI turns, that the model can see at once.
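One common context-management strategy (a simplified sketch, not Cursor's actual implementation) is a sliding window: always keep the system prompt, and drop the oldest turns once a budget is exceeded. Characters stand in for tokens here:

```python
def trim_context(messages, budget):
    """Keep the system prompt plus the most recent turns that fit.

    `budget` is measured in characters here, as a crude stand-in for tokens.
    """
    system, rest = messages[0], messages[1:]
    kept, used = [], len(system["content"])
    for msg in reversed(rest):                    # walk newest-first
        if used + len(msg["content"]) > budget:
            break                                 # oldest turns fall off
        kept.append(msg)
        used += len(msg["content"])
    return [system] + list(reversed(kept))

history = [
    {"role": "system", "content": "You are a coding assistant."},
    {"role": "user", "content": "x" * 400},       # old, oversized turn
    {"role": "user", "content": "Fix the failing test."},
]
print(trim_context(history, budget=100))
```

Real editors do something richer (summarizing dropped turns rather than discarding them), but the budget-and-window shape is the same.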






AI-driven project initialization:

1) Describe - describe your problem or expectation
2) Define - stack, database, auth & deployment
3) Generate - Cursor takes care of generating the code



 

Tuesday, March 10, 2026

Infrastructure as Code for AI Workloads

March 10, 2026



A Kubernetes operator is a specialized controller designed to extend the Kubernetes API, enabling the management of complex applications through declarative configuration.
Kubernetes operators work within a continuous reconciliation loop. The cycle begins when a user creates a resource, prompting the controller to monitor for changes and take the actions necessary to reach the desired state. Operators allow users to define the desired state of an application in custom resources, while the operator's controller continually reconciles the actual state with this desired state, embodying the operational expertise of a human site reliability engineer.
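The reconciliation loop can be sketched in a few lines (a toy illustration; the field names `replicas` and `version` are hypothetical, not from any real operator):

```python
def reconcile(desired, actual):
    """One pass of an operator's reconciliation loop (toy sketch).

    `desired` comes from the custom resource; `actual` from observing
    the cluster. Returns the actions needed to converge actual -> desired.
    """
    actions = []
    have = actual.get("replicas", 0)
    if have < desired["replicas"]:
        actions.append(("scale_up", desired["replicas"] - have))
    elif have > desired["replicas"]:
        actions.append(("scale_down", have - desired["replicas"]))
    if actual.get("version") != desired["version"]:
        actions.append(("upgrade", desired["version"]))
    return actions  # a real controller applies these, then re-observes

print(reconcile({"replicas": 3, "version": "1.2"},
                {"replicas": 2, "version": "1.1"}))
# [('scale_up', 1), ('upgrade', '1.2')]
```

A real controller runs this on every watch event and re-queues until the diff is empty; that loop, not any single action, is what makes it "level-triggered".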
Kubeflow: the primary orchestration tool for ML workflows, focused on the training aspect of models.

Most commonly used Kubernetes resources in practice:
APIService
ClusterRole
ClusterRoleBinding
ConfigMap
CronJob
CSIDriver
CSINode
DaemonSet
Deployment
EphemeralContainers
HorizontalPodAutoscaler
Ingress
IngressClass
Job
Namespace
Node
PersistentVolume
Pod
PodDisruptionBudget
PodTemplate
ReplicaSet
ResourceQuota
Role
RoleBinding
Secret
Service
ServiceAccount
StatefulSet
StorageClass
VolumeAttachment
Binding
CertificateSigningRequest
ComponentStatus
ControllerRevision
CustomResourceDefinition
Endpoints
EndpointSlice
Lease
ReplicationController
LimitRange
LocalSubjectAccessReview
MutatingWebhookConfiguration
NetworkPolicy
PodSecurityPolicy
PriorityClass
RuntimeClass
SelfSubjectAccessReview
SelfSubjectRulesReview
SubjectAccessReview
TokenReview
ValidatingWebhookConfiguration



Monday, March 2, 2026

TLS 1.3 Cipher Suite

March 02, 2026

 



TLS 1.3 was released in August 2018 (RFC 8446). It is the latest version of the Transport Layer Security protocol. It removes weaker algorithms and improves handshake speed.

TLS 1.2 cipher suite diagram:

TLS_DHE_RSA_WITH_AES_256_CBC_SHA
Key Exchange[DHE], Authentication [RSA], Encryption [AES_256_CBC] and Hashing [SHA]

TLS 1.3 supports only 5 cipher suites, compared to the many cipher suites supported by TLS 1.2.

TLS 1.3 includes 5 cipher suites:
  • TLS_AES_128_GCM_SHA256 [Must implement]
  • TLS_AES_256_GCM_SHA384 [Should implement]
  • TLS_CHACHA20_POLY1305_SHA256 [Should implement]
  • TLS_AES_128_CCM_SHA256 [May implement]
  • TLS_AES_128_CCM_8_SHA256 [May implement]
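The naming convention itself shows what changed: a TLS 1.3 suite name carries only the AEAD cipher and the HKDF hash, with no key-exchange or authentication part (those are negotiated separately, unlike the TLS 1.2 name broken down above). A small parser makes this visible (an illustrative sketch, not any library's API):

```python
def parse_tls13_suite(name):
    """Split a TLS 1.3 suite name into its AEAD cipher and HKDF hash.

    Unlike TLS 1.2 names (e.g. TLS_DHE_RSA_WITH_AES_256_CBC_SHA), there
    is no key-exchange or authentication component to parse out.
    """
    assert name.startswith("TLS_")
    body, sha = name[len("TLS_"):].rsplit("_SHA", 1)
    return {"aead": body, "hash": "SHA" + sha}

print(parse_tls13_suite("TLS_AES_128_GCM_SHA256"))
# {'aead': 'AES_128_GCM', 'hash': 'SHA256'}
print(parse_tls13_suite("TLS_CHACHA20_POLY1305_SHA256"))
# {'aead': 'CHACHA20_POLY1305', 'hash': 'SHA256'}
```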
All of them provide forward secrecy [once encrypted, always encrypted].
TLS 1.3 removes custom DH groups and supports only standards-based groups, because custom groups can lead to insecure groups being used and a breach of security.
DH stands for Diffie-Hellman, which starts with both parties agreeing upon some public values.
Approved DH groups are designated via various standards.
* Traditional DH groups: RFC 2409 & RFC 3526
* Elliptic curve groups: RFC 5639, FIPS 186-4
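As a toy illustration of the Diffie-Hellman agreement itself (tiny numbers for readability; real groups use the large primes from the RFCs above):

```python
# Toy Diffie-Hellman with tiny numbers. Real groups use ~2048-bit primes
# (RFC 3526) or the elliptic-curve groups mentioned above.
p, g = 23, 5            # public: prime modulus and generator (the agreed values)

a = 6                   # Alice's secret exponent
b = 15                  # Bob's secret exponent

A = pow(g, a, p)        # Alice sends A = g^a mod p
B = pow(g, b, p)        # Bob sends   B = g^b mod p

shared_alice = pow(B, a, p)   # (g^b)^a mod p
shared_bob = pow(A, b, p)     # (g^a)^b mod p
print(shared_alice, shared_bob)  # 2 2 - both sides derive the same secret
```

Deleting the secret exponents after the shared secret is derived is exactly what gives ephemeral DH its forward secrecy.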

Handshake: TLS 1.2 vs TLS 1.3


TLS 1.2 uses 2 round trips for the handshake, while TLS 1.3 uses only 1 round trip, which gives a quicker response compared to TLS 1.2.
TLS 1.2 creates 4 keys during the handshake:
  • Client encryption
  • Client HMAC
  • Server Encryption
  • Server HMAC

TLS 1.3 creates 11 keys during the handshake.

TLS workflow:
* The client and server advertise their highest supported version in the ClientHello and ServerHello during the handshake.
A middlebox or load balancer may drop the request on a version mismatch for versions up to TLS 1.2.
TLS 1.3 therefore sets the record header to TLS 1.0 and the ClientHello version to TLS 1.2, and signals TLS 1.3 in a ClientHello extension, so the request is not dropped by a middlebox in between.
* TLS provides end-to-end handshake encryption. The handshake creates session keys which protect the application data. Session keys are derived from the ephemeral key exchange [(EC)DHE; TLS 1.3 removed static RSA key exchange]:
the public key is shared, and the ephemeral private key is deleted after the shared secret (seed) is established, which provides forward secrecy.

The ClientHello carries information about the version, session ID, and cipher suites.
The diagram below gives details about the 11 keys of TLS 1.3.



Saturday, November 8, 2025

MCP - Model Context Protocol

November 08, 2025

 


MCP defines how an LLM accesses external data, tools, and context in a structured way. MCP (Model Context Protocol) is an open-source standard for connecting AI applications to external systems and data.

Overview of MCP:

AI applications such as Claude or ChatGPT can connect to data sources, tools [e.g. search engines], and workflows [prompts] through MCP and perform tasks.

MCP acts like an interface: it communicates with MCP clients, discovers their requirements, and offers the available services for those requirements.
MCP Framework:
  • MCP SDK - the foundation for all MCP development, used for production and standard projects. It can be integrated with any tools or transports (STDIO, SSE).
  • FastMCP 1.0 - now legacy, integrated into the MCP Python SDK.
  • FastMCP 2.0 - the latest, modern feature toolkit for advanced MCP workflows.
  • Other frameworks - the Java SDK and third-party libraries in other languages.
Agent workflows and memory:


RAG - Retrieval Augmented Generation
RAG converts data into numerical representations where each piece of data carries information about how it relates to the others.
Retrieval - when the user asks a question or searches, RAG turns the query into its own numerical representation (embedding) and finds data with similar meaning.
Augmentation - the top search results are then added to the prompt and sent to the LLM.
Generation - the search results give the LLM some local context, which it takes into account in its response.
Embedding:
Embeddings represent text as sets of numerical data in tensors (of different dimensions).
Each dimension stores some information about the text's semantic or syntactic meaning.
Words or sentences with similar meanings are stored nearby in vector space.
Models learn to place similar words or sentences close together in the embedding space.
Common pre-trained models such as BERT and RoBERTa are used for generating embeddings in vector space.
We can use embeddings for NLP tasks like semantic search, text classification, and sentiment analysis.
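A toy retrieval step using hand-made vectors (real embeddings come from models like BERT; the three dimensions here are invented for illustration):

```python
import math

# Hand-made "embeddings" with invented dimensions, loosely meaning
# [animal-ness, tech-ness, food-ness]. Real ones come from a model.
docs = {
    "cats chase mice":      [0.9, 0.1, 0.0],
    "kubernetes runs pods": [0.0, 0.9, 0.1],
    "pizza needs cheese":   [0.1, 0.0, 0.9],
}

def cosine(u, v):
    """Cosine similarity: how aligned two vectors are, ignoring length."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

query = [0.8, 0.2, 0.0]     # pretend embedding of "tell me about kittens"
best = max(docs, key=lambda d: cosine(query, docs[d]))
print(best)  # cats chase mice
```

The retrieved document would then be pasted into the prompt (augmentation) before the LLM generates its answer.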
Agentic RAG:
Agentic RAG integrates AI agents to enhance the RAG approach. It breaks complex queries down into manageable parts and uses API tools where needed to augment processing and produce better results.


Implementation of AI agent

November 08, 2025


                                 

Installation of Ollama:
Ollama is an open-source tool that helps us run NLP [Natural Language Processing] models locally.
Step 1) Download the Ollama tool for your operating system and install it.