Prasad Z
From Tokens to Tensors: An Engineer's Deep Dive into LLM Inference Performance
Moving beyond the API: How understanding the silicon-level processing can shave seconds off your inference latency
Apr 26 • Prasad Z
February 2022
How to deploy Kubeflow on AWS EKS
This article is aimed at anyone who is trying to install Kubeflow on AWS.
Feb 4, 2022 • Prasad Z
December 2021
MLOps: Build ML lifecycle from scratch
Following continuous software engineering practices, there has been an increasing interest in rapid deployment of machine learning (ML) features, called…
Dec 15, 2021 • Prasad Z