LLM Engineering
1-on-1 Instructor
24/7 Fully Remote Support
8 Modules
Intermediate Level
This course is designed for aspiring AI engineers and data scientists eager to dive into the world of Generative AI and large language models (LLMs). Whether you're a beginner or a professional looking to upskill, you'll gain hands-on experience in building advanced AI applications. Expect to transform your understanding of AI technology and deploy your very own AI solutions that can significantly enhance productivity. Join us and kickstart your journey to becoming a leader in the AI field!
Full-time students: Approx. 8 weeks to finish this course
Part-time students: Approx. 16 weeks to finish this course
Course Content
Module 1
Running Your First LLM Locally with Ollama and Open Source Models
Spanish Tutor Demo with Open-Source Models & Course Overview
Setting Up Your LLM Development Environment with Cursor and UV
Setting Up Your PC Development Environment with Git and Cursor
Mac Setup: Installing Git, Cloning the Repo, and Cursor IDE
Installing UV and Setting Up Your Cursor Development Environment
Setting Up Your OpenAI API Key and Environment Variables
Installing Cursor Extensions and Setting Up Your Jupyter Notebook
Running Your First OpenAI API Call and System vs User Prompts
Building a Website Summarizer with OpenAI Chat Completions API
Hands-On Exercise: Building Your First OpenAI API Call from Scratch
LLM Engineering Building Blocks: Models, Tools & Techniques
Your 8-Week Journey: From Chat Completions API to LLM Engineer
Frontier Models: OpenAI GPT, Claude, Gemini & Grok Compared
Open-Source LLMs: LLaMA, Mistral, DeepSeek, and Ollama
Chat Completions API: HTTP Endpoints vs OpenAI Python Client
Using the OpenAI Python Client with Multiple LLM Providers
Running Ollama Locally with OpenAI-Compatible Endpoints
Base, Chat, and Reasoning Models: Understanding LLM Types
Frontier Models: GPT, Claude, Gemini & Their Strengths and Pitfalls
Testing ChatGPT-5 and Frontier LLMs Through the Web UI
Testing Claude, Gemini, Grok & DeepSeek with ChatGPT Deep Research
Agentic AI in Action: Deep Research, Claude Code, and Agent Mode
Frontier Models Showdown: Building an LLM Competition Game
Understanding Transformers: The Architecture Behind GPT and LLMs
From LSTMs to Transformers: Attention, Emergent Intelligence & Agentic AI
Parameters: From Millions to Trillions in GPT, LLaMA & DeepSeek
What Are Tokens? From Characters to GPT's Tokenizer
Understanding Tokenization: How GPT Breaks Down Text into Tokens
Tokenizing with tiktoken and Understanding the Illusion of Memory
Context Windows, API Costs, and Token Limits in LLMs
Building a Sales Brochure Generator with OpenAI Chat Completions API
Building JSON Prompts and Using OpenAI's Chat Completions API
Chaining GPT Calls: Building an AI Company Brochure Generator
Building a Brochure Generator with GPT-4 and Streaming Results
Business Applications, Challenges & Building Your AI Tutor
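To preview what this module builds toward, here is a minimal sketch of a Chat Completions call with separate system and user prompts. The model name and prompt wording are illustrative, not the course's exact code; running it requires the `openai` package and an `OPENAI_API_KEY` environment variable.

```python
def build_messages(system_prompt: str, user_prompt: str) -> list[dict]:
    """Assemble the messages list the Chat Completions API expects."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

messages = build_messages(
    "You are a helpful assistant that summarizes websites in markdown.",
    "Summarize the contents of https://example.com",
)

if __name__ == "__main__":
    # Hedged: requires the `openai` package and a valid OPENAI_API_KEY.
    from openai import OpenAI

    client = OpenAI()
    response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    print(response.choices[0].message.content)
```

The system prompt sets persona and output format; the user prompt carries the task, a split the early lessons explore in depth.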
Module 2
Connecting to Multiple Frontier Models with APIs (OpenAI, Claude, Gemini)
Testing GPT-5 Models with Reasoning Effort and Scaling Puzzles
Testing Claude, GPT-5, Gemini & DeepSeek on Brain Teasers
Local Models with Ollama, Native APIs, and OpenRouter Integration
LangChain vs LiteLLM: Choosing the Right LLM Framework
LLM vs LLM: Building Multi-Model Conversations with OpenAI & Claude
Building Data Science UIs with Gradio (No Front-End Skills Required)
Building Your First Gradio Interface with Callbacks and Sharing
Building Gradio Interfaces with Authentication and GPT Integration
Markdown Responses and Streaming with Gradio and OpenAI
Building Multi-Model Gradio UIs with GPT and Claude Streaming
Building Chat UIs with Gradio: Your First Conversational AI Assistant
Building a Streaming Chatbot with Gradio and OpenAI API
System Prompts, Multi-Shot Prompting, and Your First Look at RAG
How LLM Tool Calling Really Works (No Magic, Just Prompts)
Common Use Cases for LLM Tools and Agentic AI Workflows
Building an Airline AI Assistant with Tool Calling in OpenAI and Gradio
Handling Multiple Tool Calls with OpenAI and Gradio
Building Tool Calling with SQLite Database Integration
Introduction to Agentic AI and Building Multi-Tool Workflows
How Gradio Works: Building Web UIs from Python Code
Building Multi-Modal Apps with DALL-E 3, Text-to-Speech, and Gradio Blocks
Running Your Multimodal AI Assistant with Gradio and Tools
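The tool-calling pattern at the heart of this module can be sketched without an API call: you describe a function in a JSON schema, the model replies with a tool call, and your code executes it and returns the result. The airline tool schema and ticket prices below are invented for illustration.

```python
import json

# Hypothetical ticket prices for the airline-assistant example.
TICKET_PRICES = {"london": "$799", "paris": "$899", "tokyo": "$1400"}

def get_ticket_price(destination_city: str) -> str:
    return TICKET_PRICES.get(destination_city.lower(), "Unknown")

# OpenAI-style tool schema describing the function to the model.
price_tool = {
    "type": "function",
    "function": {
        "name": "get_ticket_price",
        "description": "Get the price of a return ticket to the destination city.",
        "parameters": {
            "type": "object",
            "properties": {
                "destination_city": {"type": "string", "description": "The city to fly to"},
            },
            "required": ["destination_city"],
        },
    },
}

def handle_tool_call(name: str, arguments_json: str) -> str:
    """Dispatch a tool call the model requested and return the tool's result."""
    args = json.loads(arguments_json)
    if name == "get_ticket_price":
        return get_ticket_price(args["destination_city"])
    raise ValueError(f"Unknown tool: {name}")

# The model returns the function name and JSON arguments; we execute them and
# send the result back in a {"role": "tool", ...} message on the next call.
result = handle_tool_call("get_ticket_price", '{"destination_city": "Paris"}')
```

No magic, just prompts: the schema is serialized into the conversation, and the model is trained to emit a structured call rather than prose when a tool fits.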
Module 3
Introduction to Hugging Face Platform: Models, Datasets, and Spaces
HuggingFace Libraries: Transformers, Datasets, and Hub Explained
Introduction to Google Colab and Cloud GPUs for AI Development
Getting Started with Google Colab: Setup, Runtime, and Free GPU Access
Setting Up Google Colab with Hugging Face and Running Your First Model
Running Stable Diffusion and FLUX on Google Colab GPUs
Introduction to Hugging Face Pipelines for Quick AI Inference
HuggingFace Pipelines API for Sentiment Analysis on Colab T4 GPU
Named Entity Recognition, Q&A, and Hugging Face Pipeline Tasks
Hugging Face Pipelines: Image, Audio & Diffusion Models in Colab
Tokenizers: How LLMs Convert Text to Numbers
Tokenizers in Action: Encoding and Decoding with Llama 3.1
How Chat Templates Work: LLaMA Tokenizers and Special Tokens
Comparing Tokenizers: Phi-4, DeepSeek, and Qwen Coder in Action
Deep Dive into Transformers, Quantization, and Neural Networks
Working with Hugging Face Transformers Low-Level API and Quantization
Inside LLaMA: PyTorch Model Architecture and Token Embeddings
Inside LLaMA: Decoder Layers, Attention, and Why Non-Linearity Matters
Running Open Source LLMs: Phi, Gemma, Qwen & DeepSeek with Hugging Face
Visualizing Token-by-Token Inference in GPT Models
Building Meeting Minutes from Audio with Whisper and Google Colab
Building Meeting Minutes with OpenAI Whisper and LLaMA 3.2
Week 3 Wrap-Up: Build a Synthetic Data Generator with Open Source Models
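The chat-template lessons in this module come down to string assembly: special tokens mark each message's role before the text reaches the model. Below is an illustrative approximation of the Llama 3 template; in practice you would call `tokenizer.apply_chat_template` from the `transformers` library rather than build it by hand.

```python
def apply_llama3_style_template(messages: list[dict]) -> str:
    """Approximate the Llama 3 chat template with its special tokens."""
    out = "<|begin_of_text|>"
    for m in messages:
        out += (
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    # Cue the model to generate the assistant's reply next.
    out += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return out

prompt = apply_llama3_style_template([
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Tell me a joke."},
])
```

Seeing the template as plain text makes clear why mismatched special tokens between models break inference, a recurring theme when comparing tokenizers.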
Module 4
Choosing the Right LLM: Model Selection Strategy and Basics
The Chinchilla Scaling Law: Parameters, Training Data and Why It Matters
Understanding AI Model Benchmarks: GPQA, MMLU-Pro, and HLE
Limitations of AI Benchmarks: Data Contamination and Overfitting
Build a Connect Four Leaderboard (Reasoning Benchmark)
Navigating AI Leaderboards: Artificial Analysis, HuggingFace & More
Artificial Analysis Deep Dive: Model Intelligence vs Cost Comparison
Vellum, SEAL, and LiveBench: Essential AI Model Leaderboards
LM Arena: Blind Testing AI Models with Community Elo Ratings
Commercial Use Cases: Automation, Augmentation & Agentic AI
Selecting LLMs for Code Generation: Python to C++ with Cursor
Selecting Frontier Models: GPT-5, Claude, Grok & Gemini for C++ Code Gen
Porting Python to C++ with GPT-5: 230x Performance Speedup
AI Coding Showdown: GPT-5 vs Claude vs Gemini vs Groq Performance
Open Source Models for Code Generation: Qwen, DeepSeek & Ollama
Building a Gradio UI to Test Python-to-C++ Code Conversion Models
Qwen 3 Coder vs GPT OSS: OpenRouter Model Performance Showdown
Model Evaluation: Technical Metrics vs Business Outcomes
Python to Rust Code Translation: Testing Gemini 2.5 Pro with Cursor
Porting Python to Rust: Testing GPT, Claude, and Qwen Models
Open Source Model Wins? Rust Code Generation Speed Challenge
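The code-generation showdowns in this module follow one pattern: hand a model a Python hot loop and a strict prompt, then benchmark the ported C++ or Rust. The snippet and prompt wording below are illustrative stand-ins, not the course's exact materials.

```python
# A deliberately slow numeric loop of the kind used to benchmark ports.
PYTHON_SNIPPET = """
def calculate(iterations: int) -> float:
    result = 1.0
    for i in range(1, iterations + 1):
        result -= 1 / (i * 4 - 1)
        result += 1 / (i * 4 + 1)
    return result * 4
"""

def build_port_prompt(python_code: str, target_language: str = "C++") -> str:
    """Build a porting prompt that demands code-only, behavior-identical output."""
    return (
        f"Rewrite this Python code in high-performance {target_language}. "
        "Respond only with code, with no explanation. The output must be "
        "identical to the Python version.\n\n" + python_code
    )

prompt = build_port_prompt(PYTHON_SNIPPET)
```

The same prompt is sent unchanged to each frontier and open-source model, so the resulting binaries can be timed head to head.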
Module 5
RAG Fundamentals: Leveraging External Data to Improve LLM Responses
Building a DIY RAG System: Implementing Retrieval-Augmented Generation
Understanding Vector Embeddings: The Key to RAG and LLM Retrieval
Unveiling LangChain: Simplify RAG Implementation for LLM Applications
LangChain Text Splitter Tutorial: Optimizing Chunks for RAG Systems
Preparing for Vector Databases: OpenAI Embeddings and Chroma in RAG
Mastering Vector Embeddings: OpenAI and Chroma for LLM Engineering
Visualizing Embeddings: Exploring Multi-Dimensional Space with t-SNE
Building RAG Pipelines: From Vectors to Embeddings with LangChain
Implementing RAG Pipeline: LLM, Retriever, and Memory in LangChain
Mastering Retrieval-Augmented Generation: Hands-On LLM Integration
Mastering the RAG Pipeline: Building Efficient RAG Systems
Optimizing RAG Systems: Troubleshooting and Fixing Common Problems
Switching Vector Stores: FAISS vs Chroma in LangChain RAG Pipelines
Demystifying LangChain: Behind-the-Scenes of RAG Pipeline Construction
Debugging RAG: Optimizing Context Retrieval in LangChain
Build Your Personal AI Knowledge Worker: RAG for Productivity Boost
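The retrieval step this module builds can be sketched in a few lines: embed the query, rank chunks by cosine similarity, and stuff the winners into the prompt. The tiny hand-made vectors below stand in for real embeddings; a production system would use an embedding model and a vector store such as Chroma or FAISS.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy knowledge base: chunk text mapped to a made-up 3-d "embedding".
CHUNKS = {
    "Our refund policy allows returns within 30 days.": [0.9, 0.1, 0.0],
    "The office is closed on public holidays.": [0.1, 0.8, 0.1],
    "Premium support is available 24/7 by email.": [0.0, 0.2, 0.9],
}

def retrieve(query_embedding: list[float], k: int = 1) -> list[str]:
    """Return the k chunks most similar to the query embedding."""
    ranked = sorted(
        CHUNKS,
        key=lambda c: cosine_similarity(query_embedding, CHUNKS[c]),
        reverse=True,
    )
    return ranked[:k]

# A query about refunds should embed close to the refund-policy chunk.
top = retrieve([0.85, 0.15, 0.05])
```

Everything else in a RAG pipeline, chunking, embedding, and prompt assembly, is plumbing around this ranking step.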
Module 6
Fine-Tuning Large Language Models: From Inference to Training
Finding and Crafting Datasets for LLM Fine-Tuning: Sources & Techniques
Data Curation Techniques for Fine-Tuning LLMs on Product Descriptions
Optimizing Training Data: Scrubbing Techniques for LLM Fine-Tuning
Evaluating LLM Performance: Model-Centric vs Business-Centric Metrics
LLM Deployment Pipeline: From Business Problem to Production Solution
Prompting, RAG, and Fine-Tuning: When to Use Each Approach
Productionizing LLMs: Best Practices for Deploying AI Models at Scale
Optimizing Large Datasets for Model Training: Data Curation Strategies
How to Create a Balanced Dataset for LLM Training: Curation Techniques
Finalizing Dataset Curation: Analyzing Price-Description Correlations
How to Create and Upload a High-Quality Dataset on HuggingFace
Feature Engineering and Bag of Words: Building ML Baselines for NLP
Baseline Models in ML: Implementing Simple Prediction Functions
Feature Engineering Techniques for Amazon Product Price Prediction Models
Optimizing LLM Performance: Advanced Feature Engineering Strategies
Linear Regression for LLM Fine-Tuning: Baseline Model Comparison
Bag of Words NLP: Implementing Count Vectorizer for Text Analysis in ML
Support Vector Regression vs Random Forest: Machine Learning Face-Off
Comparing Traditional ML Models: From Random to Random Forest
Evaluating Frontier Models: Comparing Performance to Baseline Frameworks
Human vs AI: Evaluating Price Prediction Performance in Frontier Models
GPT-4o Mini: Frontier AI Model Evaluation for Price Estimation Tasks
Comparing GPT-4 and Claude: Model Performance in Price Prediction Tasks
Frontier AI Capabilities: LLMs Outperforming Traditional ML Models
Fine-Tuning LLMs with OpenAI: Preparing Data, Training, and Evaluation
How to Prepare JSONL Files for Fine-Tuning Large Language Models (LLMs)
Step-by-Step Guide: Launching GPT Fine-Tuning Jobs with OpenAI API
Fine-Tuning LLMs: Track Training Loss & Progress with Weights & Biases
Evaluating Fine-Tuned LLMs Metrics: Analyzing Training & Validation Loss
LLM Fine-Tuning Challenges: When Model Performance Doesn't Improve
Fine-Tuning Frontier LLMs: Challenges & Best Practices for Optimization
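OpenAI fine-tuning expects a JSONL file where each line is a JSON object holding a full "messages" conversation. A minimal sketch of that preparation step, using made-up product-price examples in the spirit of this module's dataset:

```python
import json

# Hypothetical (description, price) training pairs.
EXAMPLES = [
    ("USB-C charging cable, 2m, braided nylon", "$12.99"),
    ("Mechanical keyboard with RGB backlight", "$89.00"),
]

def to_jsonl(examples) -> str:
    """Serialize examples into OpenAI fine-tuning JSONL: one conversation per line."""
    lines = []
    for description, price in examples:
        record = {
            "messages": [
                {"role": "system", "content": "Estimate the price of the product."},
                {"role": "user", "content": description},
                {"role": "assistant", "content": price},
            ]
        }
        lines.append(json.dumps(record))
    return "\n".join(lines)

jsonl_text = to_jsonl(EXAMPLES)
```

The assistant turn holds the target output; once uploaded, the same system and user format must be used at inference time for the fine-tuned model to behave as trained.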
Module 7
Mastering Parameter-Efficient Fine-Tuning: LoRA, QLoRA & Hyperparameters
Introduction to LoRA Adaptors: Low-Rank Adaptation Explained
QLoRA: Quantization for Efficient Fine-Tuning of Large Language Models
Optimizing LLMs: R, Alpha, and Target Modules in QLoRA Fine-Tuning
Parameter-Efficient Fine-Tuning: PEFT for LLMs with Hugging Face
How to Quantize LLMs: Reducing Model Size with 8-bit Precision
Double Quantization & NF4: Advanced Techniques for 4-Bit LLM Optimization
Exploring PEFT Models: The Role of LoRA Adapters in LLM Fine-Tuning
Model Size Summary: Comparing Quantized and Fine-Tuned Models
How to Choose the Best Base Model for Fine-Tuning Large Language Models
Selecting the Best Base Model: Analyzing HuggingFace's LLM Leaderboard
Exploring Tokenizers: Comparing Llama, Qwen, and Other LLMs
Optimizing LLM Performance: Loading and Tokenizing Llama 3.1 Base Model
Quantization Impact on LLMs: Analyzing Performance Metrics and Errors
Comparing LLMs: GPT-4 vs Llama 3.1 in Parameter-Efficient Tuning
QLoRA Hyperparameters: Mastering Fine-Tuning for Large Language Models
Understanding Epochs and Batch Sizes in Model Training
Learning Rate, Gradient Accumulation, and Optimizers Explained
Setting Up the Training Process for Fine-Tuning
Configuring SFTTrainer for 4-Bit Quantized LoRA Fine-Tuning of LLMs
Fine-Tuning LLMs: Launching the Training Process with QLoRA
Monitoring and Managing Training with Weights & Biases
Keeping Training Costs Low: Efficient Fine-Tuning Strategies
Efficient Fine-Tuning: Using Smaller Datasets for QLoRA Training
Visualizing LLM Fine-Tuning Progress with Weights and Biases Charts
Advanced Weights & Biases Tools and Model Saving on Hugging Face
End-to-End LLM Fine-Tuning: From Problem Definition to Trained Model
The Four Steps in LLM Training: From Forward Pass to Optimization
QLoRA Training Process: Forward Pass, Backward Pass and Loss Calculation
Understanding Softmax and Cross-Entropy Loss in Model Training
Monitoring Fine-Tuning: Weights & Biases for LLM Training Analysis
Revisiting the Podium: Comparing Model Performance Metrics
Evaluation of our Proprietary, Fine-Tuned LLM against Business Metrics
Visualization of Results: Did We Beat GPT-4?
Hyperparameter Tuning for LLMs: Improving Model Accuracy with PEFT
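The arithmetic behind LoRA's efficiency is simple enough to check by hand: a frozen weight of shape d_out x d_in gains two small trainable matrices A (r x d_in) and B (d_out x r), so the adapter adds r * (d_in + d_out) parameters. The layer dimensions below are illustrative, chosen to resemble a large attention projection.

```python
def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters added by a rank-r LoRA adapter on a d_out x d_in weight."""
    return r * (d_in + d_out)

# Example: a 4096x4096 projection with rank r=32.
full = 4096 * 4096                     # frozen parameters in the original weight
adapter = lora_params(4096, 4096, 32)  # trainable parameters in the adapter
fraction = adapter / full              # share of the layer that actually trains
```

This is why raising r (alongside alpha) trades a little memory for more adapter capacity, the tuning knob several lessons in this module revolve around.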
Module 8
From Fine-Tuning to Multi-Agent Systems: Next-Level LLM Engineering
Building a Multi-Agent AI Architecture for Automated Deal Finding Systems
Unveiling Modal: Deploying Serverless Models to the Cloud
Llama on the Cloud: Running Large Models Efficiently
Building a Serverless AI Pricing API: Step-by-Step Guide with Modal
Multiple Production Models Ahead: Preparing for Advanced RAG Solutions
Implementing Agentic Workflows: Frontier Models and Vector Stores in RAG
Building a Massive Chroma Vector Datastore for Advanced RAG Pipelines
Visualizing Vector Spaces: Advanced RAG Techniques for Data Exploration
3D Visualization Techniques for RAG: Exploring Vector Embeddings
Finding Similar Products: Building a RAG Pipeline without LangChain
RAG Pipeline Implementation: Enhancing LLMs with Retrieval Techniques
Random Forest Regression: Using Transformers & ML for Price Prediction
Building an Ensemble Model: Combining LLM, RAG, and Random Forest
Wrap-Up: Finalizing Multi-Agent Systems and RAG Integration
Enhancing AI Agents with Structured Outputs: Pydantic & BaseModel Guide
Scraping RSS Feeds: Building an AI-Powered Deal Selection System
Structured Outputs in AI: Implementing GPT-4 for Detailed Deal Selection
Optimizing AI Workflows: Refining Prompts for Accurate Price Recognition
Mastering Autonomous Agents: Designing Multi-Agent AI Workflows
The 5 Hallmarks of Agentic AI: Autonomy, Planning, and Memory
Building an Agentic AI System: Integrating Pushover for Notifications
Implementing Agentic AI: Creating a Planning Agent for Automated Workflows
Building an Agent Framework: Connecting LLMs and Python Code
Completing Agentic Workflows: Scaling for Business Applications
Autonomous AI Agents: Building Intelligent Systems Without Human Input
AI Agents with Gradio: Advanced UI Techniques for Autonomous Systems
Finalizing the Gradio UI for Our Agentic AI Solution
Enhancing AI Agent UI: Gradio Integration for Real-Time Log Visualization
Analyzing Results: Monitoring Agent Framework Performance
AI Project Retrospective: 8-Week Journey to Becoming an LLM Engineer
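The planning-agent pattern this final module assembles can be sketched with plain Python classes: a planner runs specialist agents in sequence, passing shared context along. The agent names and deal data below are invented for illustration; the course's version wires these roles to LLMs, a vector store, and services such as Modal and Pushover.

```python
class Agent:
    """Base class: every agent transforms and returns the shared context."""
    name = "agent"

    def run(self, context: dict) -> dict:
        raise NotImplementedError

class ScannerAgent(Agent):
    name = "scanner"

    def run(self, context):
        # In the real system this scrapes RSS feeds and asks an LLM to pick deals.
        context["deals"] = [{"item": "headphones", "price": 49.0, "estimate": 90.0}]
        return context

class MessagingAgent(Agent):
    name = "messenger"

    def run(self, context):
        # In the real system this sends a push notification for good deals.
        context["alerts"] = [
            f"Deal: {d['item']} at ${d['price']:.0f} (est. ${d['estimate']:.0f})"
            for d in context.get("deals", [])
            if d["estimate"] - d["price"] >= 30  # only alert on a big enough discount
        ]
        return context

class PlanningAgent(Agent):
    name = "planner"

    def __init__(self, workers):
        self.workers = workers

    def run(self, context):
        # Run each specialist in order, threading the shared context through.
        for worker in self.workers:
            context = worker.run(context)
        return context

result = PlanningAgent([ScannerAgent(), MessagingAgent()]).run({})
```

Keeping the framework this small makes the agentic hallmarks, planning, delegation, and shared memory, easy to see before the LLM calls are layered in.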
Price: ¥23,800 (approx. $3,299), 17% off the regular ¥28,800
