Aditya Prasath Ravilla Shridhar Prasad · Systems · GPU · AI

Aditya Prasath

Education: M.S. Computer Science @ University of Illinois Urbana-Champaign
Current role: AI Engineering Intern @ StitchStudio
Technology focus: Distributed systems · GPU compute · AI infrastructure · production LLMs

I build backends that survive real traffic, CUDA stacks that earn their speedup, and LLM systems that ship in production—not demos.

Email: apr11@illinois.edu

What I research

Project · GPT-2 Inference Engine

KV-cache shifts the bottleneck in transformer inference

On A40, the largest gains came from not recomputing past keys and values—memory traffic dropped before raw FLOPs did.

CUDA · FlashAttention-2 · Nsight Compute

Project · ThinkerCUDA

Occupancy wins when tiles fit the SM, not when blocks look big

ThinkerCUDA taught me that oversized thread blocks can hide latency on paper—and lose on real hardware.

CUDA · C++ · HPC

Course · UIUC CS 598 · PACT

Agents over-deliberate—pruning calls beats bigger models

PACT research: a lightweight speculator can truncate redundant LLM turns without tanking task accuracy.

LLM agents · DPO · Phi-3-mini

Research · Intelligent Surveillance over 5G Edge

Edge inference is a bandwidth contract, not a compute flex

Our 5G surveillance work showed when to infer on-device—and when centralized GPUs still win.

Edge AI · 5G · Real-time inference

Python · C++ · Java · JavaScript · Bash · SQL · High-throughput services · CUDA / GPU kernels · Kafka · Query optimization · Prometheus · Grafana · AWS · Kubernetes · Docker · Microservices · REST · Llama 3 · DeepSeek · LangChain · RAG · FAISS · GPT-2 · Agentic Engineering · Spark · HDFS · ETL

Experience

Total Industry Experience (3+ Years)

Production software across telecom, AI platforms, and chartered accountancy workflow automation.

May 2026 – Present

AI Engineering Intern

StitchStudio · Chicago, IL · Remote

Client: Internal Project

Building LangChain agents on Llama 3 with prompt-routing and state-machine orchestration.
RAG with FAISS—sub-100ms retrieval over million-scale embeddings.

LLMs · LangChain · RAG · FAISS

Aug 2023 – Jul 2025

Software Engineer

Cognizant Technology Solutions · Tamil Nadu, India

Client: Verizon

Owned microservices at 8M+ requests/day; 99.95% availability with circuit breakers and autoscaling.
Cut end-to-end latency 42% with async Kafka and SQL plan optimization.
Reduced incidents 35% with Prometheus/Grafana monitoring; raised test coverage 25%.

Kafka · Microservices · Prometheus · PostgreSQL

Sep 2022 – Dec 2022

Software Analyst Intern

BSP & CO. · Tamil Nadu, India

Client: Internal Project

Owned automation initiatives for manual practitioner workflows at a chartered accountancy firm—scoped bottlenecks with partners and shipped pipeline tooling end to end.
Built workflow automations that replaced repetitive manual steps in client reporting and compliance prep, improving turnaround time and consistency.
Documented process maps and handoff runbooks so firm staff could operate and extend automations without engineering support.

Workflow Automation · Python · Process Design

Sep 2021 – Aug 2022

Backend Platform Intern

SMZ & CO. · Kuala Lumpur · Remote

Client: Internal Project

Owned backend automation for manual task pipelines at a CA firm—translating spreadsheet-driven audit and reporting work into production APIs and data flows.
Designed REST services and PostgreSQL schemas for 100K+ records/month; reduced reporting latency from minutes to sub-second queries.
Worked directly with practitioners to map firm workflows, then delivered integrations that cut repetitive manual effort across deliverables.

PostgreSQL · REST · Workflow Automation · API Design

Selected work

Hands-on projects across inference, GPU kernels, agents, and cloud-native platforms.

GPT-2 Inference Engine

GPT-2 forward pass from scratch: FlashAttention-2, KV-cache, and memory tiling profiled on NVIDIA A40.

ThinkerCUDA

3D convolution and tiled matmul kernels—6× throughput over CPU through coalescing and occupancy tuning.

Audit Orchestrator

Agentic audit workflows with BMAD-METHOD; routes prompts by task complexity for throughput and accuracy.

Auto-Scaling Platform

EKS microservices behind an ALB with autoscaling; blue-green and canary releases for zero-downtime deploys.

Research

Peer-reviewed edge AI and in-progress work on faster, leaner LLM agents.

UIUC CS 598 · In progress

PACT: Pruned Agent Call Throughput

Speculator model (Phi-3-mini) trims LLM over-deliberation; DPO alignment balances accuracy and latency.

LLM Agents · DPO · Speculative Decoding

Conference paper

Intelligent Surveillance over 5G Edge

Optimized real-time inference across edge and cloud—latency and bandwidth under 5G constraints.

Edge AI · 5G · Inference

Technical range

Languages and platforms I reach for when performance and reliability both matter.

Languages

Daily drivers

Python
C++
Java
JavaScript
Bash
SQL

Systems

Scale & reliability

High-throughput services
CUDA / GPU kernels
Kafka
Query optimization
Prometheus · Grafana

Cloud

Deploy & operate

AWS
Kubernetes
Docker
Microservices
REST

AI & data

Models & pipelines

Llama 3 · DeepSeek
LangChain
RAG · FAISS
GPT-2
Agentic Engineering
Spark
HDFS
ETL

Education

In Progress

M.S. Computer Science

University of Illinois Urbana-Champaign

Aug 2025 – Present

Parallel programming · Systems for GenAI · Applied ML · Cloud · LLMs

GPA: 3.97

B.Tech, CS & Business Systems

SRM Institute of Science and Technology

Jun 2019 – Jul 2023

DSA · Compilers · DBMS · Networks · Automata

Awards

Dean's recognition for students who lead at scale—across academics, sport, and campus life.

Dean's Award · Undergraduate cohort 2019–2023

Outstanding Contribution Award

SRM Institute of Science and Technology · Jun 2019 – May 2023

Core strength: Leadership skills

Awarded for sustained campus impact—scaling student operations, mentoring peers, and connecting technical communities with university-wide programs.

Impact

Led operations for 1,500+ technical and cultural events
First-class distinction across undergraduate academics
State-level clusters representative, badminton

Leadership roles

Head of Operations, White Hat Hackers Club
Discipline & Logistics Head, Association of Computer Science Engineers
Student Mentor & Director, Rotaract Club of SRM Vadapalani

Get in touch

Open to full-time roles in systems, AI infrastructure, autonomy, and GPU computing.

+1 217-249-4900 · Champaign, IL