LLM Engineering

  • 1-on-1 Instructor

  • 24/7 Fully Remote Support

  • 8 Modules

  • Intermediate Level

This course is designed for aspiring AI engineers and data scientists eager to dive into the world of Generative AI and large language models (LLMs). Whether you're a beginner or a professional looking to upskill, you'll gain hands-on experience in building advanced AI applications. Expect to transform your understanding of AI technology and deploy your very own AI solutions that can significantly enhance productivity. Join us and kickstart your journey to becoming a leader in the AI field!

Full-time students: Approx. 8 weeks to finish this course

Part-time students: Approx. 16 weeks to finish this course

Course Content

  • Running Your First LLM Locally with Ollama and Open Source Models

    Spanish Tutor Demo with Open-Source Models & Course Overview

    Setting Up Your LLM Development Environment with Cursor and UV

    Setting Up Your PC Development Environment with Git and Cursor

    Mac Setup: Installing Git, Cloning the Repo, and Cursor IDE

    Installing UV and Setting Up Your Cursor Development Environment

    Setting Up Your OpenAI API Key and Environment Variables

    Installing Cursor Extensions and Setting Up Your Jupyter Notebook

    Running Your First OpenAI API Call and System vs User Prompts

    Building a Website Summarizer with OpenAI Chat Completions API

    Hands-On Exercise: Building Your First OpenAI API Call from Scratch

    LLM Engineering Building Blocks: Models, Tools & Techniques

    Your 8-Week Journey: From Chat Completions API to LLM Engineer

    Frontier Models: OpenAI GPT, Claude, Gemini & Grok Compared

    Open-Source LLMs: LLaMA, Mistral, DeepSeek, and Ollama

    Chat Completions API: HTTP Endpoints vs OpenAI Python Client

    Using the OpenAI Python Client with Multiple LLM Providers

    Running Ollama Locally with OpenAI-Compatible Endpoints

    Base, Chat, and Reasoning Models: Understanding LLM Types

    Frontier Models: GPT, Claude, Gemini & Their Strengths and Pitfalls

    Testing ChatGPT-5 and Frontier LLMs Through the Web UI

    Testing Claude, Gemini, Grok & DeepSeek with ChatGPT Deep Research

    Agentic AI in Action: Deep Research, Claude Code, and Agent Mode

    Frontier Models Showdown: Building an LLM Competition Game

    Understanding Transformers: The Architecture Behind GPT and LLMs

    From LSTMs to Transformers: Attention, Emergent Intelligence & Agentic AI

    Parameters: From Millions to Trillions in GPT, LLaMA & DeepSeek

    What Are Tokens? From Characters to GPT's Tokenizer

    Understanding Tokenization: How GPT Breaks Down Text into Tokens

    Tokenizing with tiktoken and Understanding the Illusion of Memory

    Context Windows, API Costs, and Token Limits in LLMs

    Building a Sales Brochure Generator with OpenAI Chat Completions API

    Building JSON Prompts and Using OpenAI's Chat Completions API

    Chaining GPT Calls: Building an AI Company Brochure Generator

    Building a Brochure Generator with GPT-4 and Streaming Results

    Business Applications, Challenges & Building Your AI Tutor
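
The module's first build, a call to the Chat Completions API with separate system and user prompts, can be sketched as follows. The helper name `build_messages` is illustrative, not from the course; the actual API call (shown in comments) needs the `openai` package and an `OPENAI_API_KEY`:

```python
def build_messages(system_prompt: str, user_prompt: str) -> list[dict]:
    """Assemble the messages list: the system prompt sets the assistant's
    behaviour, the user prompt carries the actual request."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

messages = build_messages(
    "You are an assistant that summarizes websites in markdown.",
    "Summarize the contents of this page: ...",
)

# With the openai package installed and OPENAI_API_KEY set, the call would be:
# from openai import OpenAI
# client = OpenAI()
# response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
# print(response.choices[0].message.content)
```

The same two-role structure underpins every project in the course, from the website summarizer to the brochure generator.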

  • Connecting to Multiple Frontier Models with APIs (OpenAI, Claude, Gemini)

    Testing GPT-5 Models with Reasoning Effort and Scaling Puzzles

    Testing Claude, GPT-5, Gemini & DeepSeek on Brain Teasers

    Local Models with Ollama, Native APIs, and OpenRouter Integration

    LangChain vs LiteLLM: Choosing the Right LLM Framework

    LLM vs LLM: Building Multi-Model Conversations with OpenAI & Claude

    Building Data Science UIs with Gradio (No Front-End Skills Required)

    Building Your First Gradio Interface with Callbacks and Sharing

    Building Gradio Interfaces with Authentication and GPT Integration

    Markdown Responses and Streaming with Gradio and OpenAI

    Building Multi-Model Gradio UIs with GPT and Claude Streaming

    Building Chat UIs with Gradio: Your First Conversational AI Assistant

    Building a Streaming Chatbot with Gradio and OpenAI API

    System Prompts, Multi-Shot Prompting, and Your First Look at RAG

    How LLM Tool Calling Really Works (No Magic, Just Prompts)

    Common Use Cases for LLM Tools and Agentic AI Workflows

    Building an Airline AI Assistant with Tool Calling in OpenAI and Gradio

    Handling Multiple Tool Calls with OpenAI and Gradio

    Building Tool Calling with SQLite Database Integration

    Introduction to Agentic AI and Building Multi-Tool Workflows

    How Gradio Works: Building Web UIs from Python Code

    Building Multi-Modal Apps with DALL-E 3, Text-to-Speech, and Gradio Blocks

    Running Your Multimodal AI Assistant with Gradio and Tools
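
The "no magic, just prompts" idea behind tool calling can be sketched in a few lines: the model replies with a function name and a JSON string of arguments, and your code runs the function and returns the result. The `get_ticket_price` tool and its prices are illustrative, modeled on the airline-assistant exercise:

```python
import json

def get_ticket_price(destination_city: str) -> str:
    """Toy tool: look up a return fare for a destination (made-up prices)."""
    prices = {"london": "$799", "paris": "$899", "tokyo": "$1400"}
    return prices.get(destination_city.lower(), "Unknown")

# Registry mapping tool names (as declared to the model) to Python functions
TOOLS = {"get_ticket_price": get_ticket_price}

def handle_tool_call(name: str, arguments_json: str) -> dict:
    """Dispatch a tool call the way the API hands it back:
    a function name plus a JSON string of arguments."""
    result = TOOLS[name](**json.loads(arguments_json))
    # The tool's result goes back to the model as a "tool" message
    return {"role": "tool", "content": json.dumps({"result": result})}

reply = handle_tool_call("get_ticket_price", '{"destination_city": "Paris"}')
```

In the real workflow, `reply` is appended to the conversation and the model is called again to produce its final answer.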

  • Introduction to Hugging Face Platform: Models, Datasets, and Spaces

    HuggingFace Libraries: Transformers, Datasets, and Hub Explained

    Introduction to Google Colab and Cloud GPUs for AI Development

    Getting Started with Google Colab: Setup, Runtime, and Free GPU Access

    Setting Up Google Colab with Hugging Face and Running Your First Model

    Running Stable Diffusion and FLUX on Google Colab GPUs

    Introduction to Hugging Face Pipelines for Quick AI Inference

    HuggingFace Pipelines API for Sentiment Analysis on Colab T4 GPU

    Named Entity Recognition, Q&A, and Hugging Face Pipeline Tasks

    Hugging Face Pipelines: Image, Audio & Diffusion Models in Colab

    Tokenizers: How LLMs Convert Text to Numbers

    Tokenizers in Action: Encoding and Decoding with Llama 3.1

    How Chat Templates Work: LLaMA Tokenizers and Special Tokens

    Comparing Tokenizers: Phi-4, DeepSeek, and Qwen Coder in Action

    Deep Dive into Transformers, Quantization, and Neural Networks

    Working with Hugging Face Transformers Low-Level API and Quantization

    Inside LLaMA: PyTorch Model Architecture and Token Embeddings

    Inside LLaMA: Decoder Layers, Attention, and Why Non-Linearity Matters

    Running Open Source LLMs: Phi, Gemma, Qwen & DeepSeek with Hugging Face

    Visualizing Token-by-Token Inference in GPT Models

    Building Meeting Minutes from Audio with Whisper and Google Colab

    Building Meeting Minutes with OpenAI Whisper and LLaMA 3.2

    Week 3 Wrap-Up: Build a Synthetic Data Generator with Open Source Models
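
What a chat template does can be illustrated without loading a model: it wraps each message in special tokens so the LLM can tell roles apart. The sketch below uses Llama-3-style tokens for illustration only; in practice you would call `tokenizer.apply_chat_template` from the `transformers` library rather than format by hand:

```python
def apply_chat_template(messages: list[dict]) -> str:
    """Toy version of a Llama-3-style chat template: wrap each message in
    role-header and end-of-turn special tokens, then cue the assistant."""
    parts = ["<|begin_of_text|>"]
    for m in messages:
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    # Leave an open assistant header so generation continues from here
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = apply_chat_template([
    {"role": "system", "content": "You are a helpful tutor."},
    {"role": "user", "content": "What is a token?"},
])
```

Each model family defines its own template and special tokens, which is why the course compares tokenizers across Llama, Phi, DeepSeek, and Qwen.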

  • Choosing the Right LLM: Model Selection Strategy and Basics

    The Chinchilla Scaling Law: Parameters, Training Data and Why It Matters

    Understanding AI Model Benchmarks: GPQA, MMLU-Pro, and HLE

    Limitations of AI Benchmarks: Data Contamination and Overfitting

    Build a Connect Four Leaderboard (Reasoning Benchmark)

    Navigating AI Leaderboards: Artificial Analysis, HuggingFace & More

    Artificial Analysis Deep Dive: Model Intelligence vs Cost Comparison

    Vellum, SEAL, and LiveBench: Essential AI Model Leaderboards

    LM Arena: Blind Testing AI Models with Community Elo Ratings

    Commercial Use Cases: Automation, Augmentation & Agentic AI

    Selecting LLMs for Code Generation: Python to C++ with Cursor

    Selecting Frontier Models: GPT-5, Claude, Grok & Gemini for C++ Code Gen

    Porting Python to C++ with GPT-5: 230x Performance Speedup

    AI Coding Showdown: GPT-5 vs Claude vs Gemini vs Grok Performance

    Open Source Models for Code Generation: Qwen, DeepSeek & Ollama

    Building a Gradio UI to Test Python-to-C++ Code Conversion Models

    Qwen 3 Coder vs GPT OSS: OpenRouter Model Performance Showdown

    Model Evaluation: Technical Metrics vs Business Outcomes

    Python to Rust Code Translation: Testing Gemini 2.5 Pro with Cursor

    Porting Python to Rust: Testing GPT, Claude, and Qwen Models

    Open Source Model Wins? Rust Code Generation Speed Challenge
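
The code-generation speedup projects in this module time a numeric kernel in Python and then in generated C++ or Rust. A minimal sketch of the Python baseline and timing harness is below; the Leibniz series for pi is an illustrative stand-in for the course's benchmark kernel, and any measured times are machine-dependent:

```python
import time

def calculate_pi(iterations: int) -> float:
    """Leibniz series: pi/4 = 1 - 1/3 + 1/5 - 1/7 + ..."""
    result = 0.0
    sign = 1.0
    for i in range(iterations):
        result += sign / (2 * i + 1)
        sign = -sign
    return result * 4

start = time.perf_counter()
pi_estimate = calculate_pi(1_000_000)
elapsed = time.perf_counter() - start
```

Handing this function to a frontier model with a prompt like "port this to C++ with maximum performance" and timing both versions is exactly the kind of experiment behind the reported 230x speedup.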

  • RAG Fundamentals: Leveraging External Data to Improve LLM Responses

    Building a DIY RAG System: Implementing Retrieval-Augmented Generation

    Understanding Vector Embeddings: The Key to RAG and LLM Retrieval

    Unveiling LangChain: Simplify RAG Implementation for LLM Applications

    LangChain Text Splitter Tutorial: Optimizing Chunks for RAG Systems

    Preparing for Vector Databases: OpenAI Embeddings and Chroma in RAG

    Mastering Vector Embeddings: OpenAI and Chroma for LLM Engineering

    Visualizing Embeddings: Exploring Multi-Dimensional Space with t-SNE

    Building RAG Pipelines: From Vectors to Embeddings with LangChain

    Implementing RAG Pipeline: LLM, Retriever, and Memory in LangChain

    Mastering Retrieval-Augmented Generation: Hands-On LLM Integration

    Master RAG Pipeline: Building Efficient RAG Systems

    Optimizing RAG Systems: Troubleshooting and Fixing Common Problems

    Switching Vector Stores: FAISS vs Chroma in LangChain RAG Pipelines

    Demystifying LangChain: Behind-the-Scenes of RAG Pipeline Construction

    Debugging RAG: Optimizing Context Retrieval in LangChain

    Build Your Personal AI Knowledge Worker: RAG for Productivity Boost
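
The DIY RAG exercise boils down to three steps: embed the query and the document chunks, rank chunks by similarity, and stuff the winner into the prompt. A minimal sketch, using a toy bag-of-words vector as a stand-in for real OpenAI embeddings (the chunks and query are invented for illustration):

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (stand-in for a real
    embedding model such as OpenAI's text-embedding-3-small)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, chunks: list[str]) -> str:
    """Return the chunk most similar to the query."""
    q = embed(query)
    return max(chunks, key=lambda c: cosine(q, embed(c)))

chunks = [
    "Our refund policy allows returns within 30 days of purchase.",
    "The premium plan includes priority support and a dedicated manager.",
]
context = retrieve("What is the refund policy?", chunks)
# The retrieved context is then prepended to the user's question in the prompt.
```

Swapping the toy embedding for real vectors and the list for a Chroma or FAISS store gives the LangChain pipeline built later in the module.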

  • Fine-Tuning Large Language Models: From Inference to Training

    Finding and Crafting Datasets for LLM Fine-Tuning: Sources & Techniques

    Data Curation Techniques for Fine-Tuning LLMs on Product Descriptions

    Optimizing Training Data: Scrubbing Techniques for LLM Fine-Tuning

    Evaluating LLM Performance: Model-Centric vs Business-Centric Metrics

    LLM Deployment Pipeline: From Business Problem to Production Solution

    Prompting, RAG, and Fine-Tuning: When to Use Each Approach

    Productionizing LLMs: Best Practices for Deploying AI Models at Scale

    Optimizing Large Datasets for Model Training: Data Curation Strategies

    How to Create a Balanced Dataset for LLM Training: Curation Techniques

    Finalizing Dataset Curation: Analyzing Price-Description Correlations

    How to Create and Upload a High-Quality Dataset on HuggingFace

    Feature Engineering and Bag of Words: Building ML Baselines for NLP

    Baseline Models in ML: Implementing Simple Prediction Functions

    Feature Engineering Techniques for Amazon Product Price Prediction Models

    Optimizing LLM Performance: Advanced Feature Engineering Strategies

    Linear Regression for LLM Fine-Tuning: Baseline Model Comparison

    Bag of Words NLP: Implementing Count Vectorizer for Text Analysis in ML

    Support Vector Regression vs Random Forest: Machine Learning Face-Off

    Comparing Traditional ML Models: From Random to Random Forest

    Evaluating Frontier Models: Comparing Performance to Baseline Frameworks

    Human vs AI: Evaluating Price Prediction Performance in Frontier Models

    GPT-4o Mini: Frontier AI Model Evaluation for Price Estimation Tasks

    Comparing GPT-4 and Claude: Model Performance in Price Prediction Tasks

    Frontier AI Capabilities: LLMs Outperforming Traditional ML Models

    Fine-Tuning LLMs with OpenAI: Preparing Data, Training, and Evaluation

    How to Prepare JSONL Files for Fine-Tuning Large Language Models (LLMs)

    Step-by-Step Guide: Launching GPT Fine-Tuning Jobs with OpenAI API

    Fine-Tuning LLMs: Track Training Loss & Progress with Weights & Biases

    Evaluating Fine-Tuned LLMs Metrics: Analyzing Training & Validation Loss

    LLM Fine-Tuning Challenges: When Model Performance Doesn't Improve

    Fine-Tuning Frontier LLMs: Challenges & Best Practices for Optimization
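
Preparing training data for OpenAI fine-tuning means writing one JSON object per line, each containing a complete `messages` conversation. A minimal sketch for the module's price-prediction project (the prompts and the example item are illustrative):

```python
import json

def to_jsonl_line(description: str, price: float) -> str:
    """Serialize one training example in the chat fine-tuning format:
    system prompt, user input, and the target assistant completion."""
    record = {
        "messages": [
            {"role": "system", "content": "You estimate prices of items."},
            {"role": "user", "content": description},
            {"role": "assistant", "content": f"Price is ${price:.2f}"},
        ]
    }
    return json.dumps(record)

# One example per line; a real training file would hold hundreds of these
lines = [to_jsonl_line("USB-C charging cable, 2m, braided", 9.99)]
jsonl = "\n".join(lines)
```

The resulting file is uploaded via the OpenAI API, and the returned file ID is passed when launching the fine-tuning job.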

  • Mastering Parameter-Efficient Fine-Tuning: LoRA, QLoRA & Hyperparameters

    Introduction to LoRA Adaptors: Low-Rank Adaptation Explained

    QLoRA: Quantization for Efficient Fine-Tuning of Large Language Models

    Optimizing LLMs: R, Alpha, and Target Modules in QLoRA Fine-Tuning

    Parameter-Efficient Fine-Tuning: PEFT for LLMs with Hugging Face

    How to Quantize LLMs: Reducing Model Size with 8-bit Precision

    Double Quantization & NF4: Advanced Techniques for 4-Bit LLM Optimization

    Exploring PEFT Models: The Role of LoRA Adapters in LLM Fine-Tuning

    Model Size Summary: Comparing Quantized and Fine-Tuned Models

    How to Choose the Best Base Model for Fine-Tuning Large Language Models

    Selecting the Best Base Model: Analyzing HuggingFace's LLM Leaderboard

    Exploring Tokenizers: Comparing LLaMA, Qwen, and Other LLM Models

    Optimizing LLM Performance: Loading and Tokenizing Llama 3.1 Base Model

    Quantization Impact on LLMs: Analyzing Performance Metrics and Errors

    Comparing LLMs: GPT-4 vs LLAMA 3.1 in Parameter-Efficient Tuning

    QLoRA Hyperparameters: Mastering Fine-Tuning for Large Language Models

    Understanding Epochs and Batch Sizes in Model Training

    Learning Rate, Gradient Accumulation, and Optimizers Explained

    Setting Up the Training Process for Fine-Tuning

    Configuring SFTTrainer for 4-Bit Quantized LoRA Fine-Tuning of LLMs

    Fine-Tuning LLMs: Launching the Training Process with QLoRA

    Monitoring and Managing Training with Weights & Biases

    Keeping Training Costs Low: Efficient Fine-Tuning Strategies

    Efficient Fine-Tuning: Using Smaller Datasets for QLoRA Training

    Visualizing LLM Fine-Tuning Progress with Weights and Biases Charts

    Advanced Weights & Biases Tools and Model Saving on Hugging Face

    End-to-End LLM Fine-Tuning: From Problem Definition to Trained Model

    The Four Steps in LLM Training: From Forward Pass to Optimization

    QLoRA Training Process: Forward Pass, Backward Pass and Loss Calculation

    Understanding Softmax and Cross-Entropy Loss in Model Training

    Monitoring Fine-Tuning: Weights & Biases for LLM Training Analysis

    Revisiting the Podium: Comparing Model Performance Metrics

    Evaluation of our Proprietary, Fine-Tuned LLM against Business Metrics

    Visualization of Results: Did We Beat GPT-4?

    Hyperparameter Tuning for LLMs: Improving Model Accuracy with PEFT
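
The efficiency of LoRA comes down to simple arithmetic: instead of training a full d_out x d_in weight update, you train two low-rank matrices A (r x d_in) and B (d_out x r), so trainable parameters per layer drop from d_out * d_in to r * (d_in + d_out). A back-of-the-envelope check (the dimensions below are illustrative, not taken from any specific model):

```python
def lora_params(d_in: int, d_out: int, r: int) -> tuple[int, int]:
    """Trainable parameter counts: full update vs a rank-r LoRA adapter."""
    full = d_in * d_out          # the frozen weight's full update
    lora = r * (d_in + d_out)    # A (r x d_in) plus B (d_out x r)
    return full, lora

full, lora = lora_params(d_in=4096, d_out=4096, r=32)
reduction = full / lora
```

This is why raising `r` (and `alpha` with it) trades memory for adapter capacity, and why QLoRA can fine-tune billion-parameter models on a single GPU.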

  • From Fine-Tuning to Multi-Agent Systems: Next-Level LLM Engineering

    Building a Multi-Agent AI Architecture for Automated Deal Finding Systems

    Unveiling Modal: Deploying Serverless Models to the Cloud

    LLAMA on the Cloud: Running Large Models Efficiently

    Building a Serverless AI Pricing API: Step-by-Step Guide with Modal

    Multiple Production Models Ahead: Preparing for Advanced RAG Solutions

    Implementing Agentic Workflows: Frontier Models and Vector Stores in RAG

    Building a Massive Chroma Vector Datastore for Advanced RAG Pipelines

    Visualizing Vector Spaces: Advanced RAG Techniques for Data Exploration

    3D Visualization Techniques for RAG: Exploring Vector Embeddings

    Finding Similar Products: Building a RAG Pipeline without LangChain

    RAG Pipeline Implementation: Enhancing LLMs with Retrieval Techniques

    Random Forest Regression: Using Transformers & ML for Price Prediction

    Building an Ensemble Model: Combining LLM, RAG, and Random Forest

    Wrap-Up: Finalizing Multi-Agent Systems and RAG Integration

    Enhancing AI Agents with Structured Outputs: Pydantic & BaseModel Guide

    Scraping RSS Feeds: Building an AI-Powered Deal Selection System

    Structured Outputs in AI: Implementing GPT-4 for Detailed Deal Selection

    Optimizing AI Workflows: Refining Prompts for Accurate Price Recognition

    Mastering Autonomous Agents: Designing Multi-Agent AI Workflows

    The 5 Hallmarks of Agentic AI: Autonomy, Planning, and Memory

    Building an Agentic AI System: Integrating Pushover for Notifications

    Implementing Agentic AI: Creating a Planning Agent for Automated Workflows

    Building an Agent Framework: Connecting LLMs and Python Code

    Completing Agentic Workflows: Scaling for Business Applications

    Autonomous AI Agents: Building Intelligent Systems Without Human Input

    AI Agents with Gradio: Advanced UI Techniques for Autonomous Systems

    Finalizing the Gradio UI for Our Agentic AI Solution

    Enhancing AI Agent UI: Gradio Integration for Real-Time Log Visualization

    Analyzing Results: Monitoring Agent Framework Performance

    AI Project Retrospective: 8-Week Journey to Becoming an LLM Engineer
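
The planning-agent pattern from the deal-finding project can be sketched as plain Python: a planner runs each specialist agent in turn and passes results along. The agent names, stub values, and memory list below are illustrative, not the course's actual implementation:

```python
class ScannerAgent:
    """Stub for the agent that scrapes deal feeds (hardcoded example deal)."""
    def run(self, memory: list) -> dict:
        return {"deal": "Wireless headphones", "asking_price": 59.0}

class PricerAgent:
    """Stub for the fine-tuned pricing model's estimate of fair value."""
    def run(self, deal: dict) -> float:
        return 99.0

class PlanningAgent:
    """Coordinate the scan -> price -> decide pipeline and record results."""
    def __init__(self):
        self.memory: list[dict] = []

    def plan(self) -> dict:
        deal = ScannerAgent().run(self.memory)
        estimate = PricerAgent().run(deal)
        deal["discount"] = estimate - deal["asking_price"]
        self.memory.append(deal)   # memory lets later runs skip seen deals
        return deal

result = PlanningAgent().plan()
```

In the full system, a notification agent fires when `discount` clears a threshold, and the whole loop runs behind the Gradio monitoring UI built at the end of the module.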

¥ 23,800 (17% off the regular price of ¥ 28,800)

($ 3,299 USD)

Enroll Now