[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-72114":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":25,"hasPages":23,"topics":26,"createdAt":10,"pushedAt":10,"updatedAt":36,"readmeContent":37,"aiSummary":38,"trendingCount":16,"starSnapshotCount":16,"syncStatus":39,"lastSyncTime":40,"discoverSource":41},72114,"LLM-Engineers-Handbook","PacktPublishing\u002FLLM-Engineers-Handbook","PacktPublishing","The LLM's practical guide: From the fundamentals to deploying advanced LLM and RAG apps to AWS using LLMOps best practices","https:\u002F\u002Fwww.amazon.com\u002FLLM-Engineers-Handbook-engineering-production\u002Fdp\u002F1836200072\u002F",null,"Python",5095,1221,59,23,0,4,19,69,12,89.66,"MIT License",false,"main",true,[27,28,29,30,31,32,33,34,35],"aws","fine-tuning-llm","genai","llm","llm-evaluation","llmops","ml-system-design","mlops","rag","2026-06-12 04:01:03","\u003Cp align='center'>\u003Ca href='https:\u002F\u002Fwww.packtpub.com\u002Fen-us\u002Funlock?step=1'>\u003Cimg src='https:\u002F\u002Fstatic.packt-cdn.com\u002Fassets\u002Fimages\u002Fpackt+events\u002FfinalGH_design_redeem.png'\u002F>\u003C\u002Fa>\u003C\u002Fp>\n\n\u003Cdiv align=\"center\">\n  \u003Ch1>👷 LLM Engineer's Handbook\u003C\u002Fh1>\n  \u003Cp class=\"tagline\">Official repository of the \u003Ca href=\"https:\u002F\u002Fwww.amazon.com\u002FLLM-Engineers-Handbook-engineering-production\u002Fdp\u002F1836200072\u002F\">LLM Engineer's Handbook\u003C\u002Fa> by \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fiusztinpaul\">Paul Iusztin\u003C\u002Fa> and \u003Ca 
href=\"https:\u002F\u002Fgithub.com\u002Fmlabonne\">Maxime Labonne\u003C\u002Fa>\u003C\u002Fp>\n  \u003Ca href=\"https:\u002F\u002Ftrendshift.io\u002Frepositories\u002F12257\" target=\"_blank\">\u003Cimg src=\"https:\u002F\u002Ftrendshift.io\u002Fapi\u002Fbadge\u002Frepositories\u002F12257\" alt=\"PacktPublishing%2FLLM-Engineers-Handbook | Trendshift\" style=\"width: 250px; height: 55px;\" width=\"250\" height=\"55\"\u002F>\u003C\u002Fa>\n\u003C\u002Fdiv>\n\u003C\u002Fbr>\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"https:\u002F\u002Fwww.amazon.com\u002FLLM-Engineers-Handbook-engineering-production\u002Fdp\u002F1836200072\u002F\">\n    \u003Cimg src=\"images\u002Fcover_plus.png\" alt=\"Book cover\">\n  \u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  Find the book on \u003Ca href=\"https:\u002F\u002Fwww.amazon.com\u002FLLM-Engineers-Handbook-engineering-production\u002Fdp\u002F1836200072\u002F\">Amazon\u003C\u002Fa> or \u003Ca href=\"https:\u002F\u002Fwww.packtpub.com\u002Fen-us\u002Fproduct\u002Fllm-engineers-handbook-9781836200062\">Packt\u003C\u002Fa>\n\u003C\u002Fp>\n\n## 🌟 Features\n\nThe goal of this book is to create your own end-to-end LLM-based system using best practices:\n\n- 📝 Data collection & generation\n- 🔄 LLM training pipeline\n- 📊 Simple RAG system\n- 🚀 Production-ready AWS deployment\n- 🔍 Comprehensive monitoring\n- 🧪 Testing and evaluation framework\n\nYou can download and use the final trained model on [Hugging Face](https:\u002F\u002Fhuggingface.co\u002Fmlabonne\u002FTwinLlama-3.1-8B-DPO).\n\n> [!IMPORTANT]\n> The code in this GitHub repository is actively maintained and may contain updates not reflected in the book. 
**Always refer to this repository for the latest version of the code.**\n\n## 🔗 Dependencies\n\n### Local dependencies\n\nTo install and run the project locally, you need the following dependencies.\n\n| Tool | Version | Purpose | Installation Link |\n|------|---------|---------|------------------|\n| pyenv | ≥2.3.36 | Multiple Python versions (optional) | [Install Guide](https:\u002F\u002Fgithub.com\u002Fpyenv\u002Fpyenv?tab=readme-ov-file#installation) |\n| Python | 3.11 | Runtime environment | [Download](https:\u002F\u002Fwww.python.org\u002Fdownloads\u002F) |\n| Poetry | >= 1.8.3 and \u003C 2.0 | Package management | [Install Guide](https:\u002F\u002Fpython-poetry.org\u002Fdocs\u002F#installation) |\n| Docker | ≥27.1.1 | Containerization | [Install Guide](https:\u002F\u002Fdocs.docker.com\u002Fengine\u002Finstall\u002F) |\n| AWS CLI | ≥2.15.42 | Cloud management | [Install Guide](https:\u002F\u002Fdocs.aws.amazon.com\u002Fcli\u002Flatest\u002Fuserguide\u002Fgetting-started-install.html) |\n| Git | ≥2.44.0 | Version control | [Download](https:\u002F\u002Fgit-scm.com\u002Fdownloads) |\n\n### Cloud services\n\nThe code also uses and depends on the following cloud services. For now, you don't have to do anything. 
We will guide you in the installation and deployment sections on how to use them:\n\n| Service | Purpose |\n|---------|---------|\n| [HuggingFace](https:\u002F\u002Fhuggingface.com\u002F) | Model registry |\n| [Comet ML](https:\u002F\u002Fwww.comet.com\u002Fsite\u002Fproducts\u002Fopik\u002F?utm_source=llm_handbook&utm_medium=github&utm_campaign=opik) | Experiment tracker |\n| [Opik](https:\u002F\u002Fwww.comet.com\u002Fsite\u002Fproducts\u002Fopik\u002F?utm_source=llm_handbook&utm_medium=github&utm_campaign=opik) | Prompt monitoring |\n| [ZenML](https:\u002F\u002Fwww.zenml.io\u002F) | Orchestrator and artifacts layer |\n| [AWS](https:\u002F\u002Faws.amazon.com\u002F) | Compute and storage |\n| [MongoDB](https:\u002F\u002Fwww.mongodb.com\u002F) | NoSQL database |\n| [Qdrant](https:\u002F\u002Fqdrant.tech\u002F) | Vector database |\n| [GitHub Actions](https:\u002F\u002Fgithub.com\u002Ffeatures\u002Factions) | CI\u002FCD pipeline |\n\nIn the [LLM Engineer's Handbook](https:\u002F\u002Fwww.amazon.com\u002FLLM-Engineers-Handbook-engineering-production\u002Fdp\u002F1836200072\u002F), Chapter 2 will walk you through each tool. Chapters 10 and 11 provide step-by-step guides on how to set up everything you need.\n\n## 🗂️ Project Structure\n\nHere is the directory overview:\n\n```bash\n.\n├── code_snippets\u002F       # Standalone example code\n├── configs\u002F             # Pipeline configuration files\n├── llm_engineering\u002F     # Core project package\n│   ├── application\u002F    \n│   ├── domain\u002F         \n│   ├── infrastructure\u002F \n│   ├── model\u002F         \n├── pipelines\u002F           # ML pipeline definitions\n├── steps\u002F               # Pipeline components\n├── tests\u002F               # Test examples\n├── tools\u002F               # Utility scripts\n│   ├── run.py\n│   ├── ml_service.py\n│   ├── rag.py\n│   ├── data_warehouse.py\n```\n\n`llm_engineering\u002F`  is the main Python package implementing LLM and RAG functionality. 
It follows Domain-Driven Design (DDD) principles:\n\n- `domain\u002F`: Core business entities and structures\n- `application\u002F`: Business logic, crawlers, and RAG implementation\n- `model\u002F`: LLM training and inference\n- `infrastructure\u002F`: External service integrations (AWS, Qdrant, MongoDB, FastAPI)\n\nThe code logic and imports flow as follows: `infrastructure` → `model` → `application` → `domain`\n\n`pipelines\u002F`: Contains the ZenML ML pipelines, which serve as the entry point for all the ML pipelines. Coordinates the data processing and model training stages of the ML lifecycle.\n\n`steps\u002F`: Contains individual ZenML steps, which are reusable components for building and customizing ZenML pipelines. Steps perform specific tasks (e.g., data loading, preprocessing) and can be combined within the ML pipelines.\n\n`tests\u002F`: Covers a few sample tests used as examples within the CI pipeline.\n\n`tools\u002F`: Utility scripts used to call the ZenML pipelines and inference code:\n- `run.py`: Entry point script to run ZenML pipelines.\n- `ml_service.py`: Starts the REST API inference server.\n- `rag.py`: Demonstrates usage of the RAG retrieval module.\n- `data_warehouse.py`: Used to export or import data from the MongoDB data warehouse through JSON files.\n\n`configs\u002F`: ZenML YAML configuration files to control the execution of pipelines and steps.\n\n`code_snippets\u002F`: Independent code examples that can be executed independently.\n\n## 💻 Installation\n\n> [!NOTE]\n> If you are experiencing issues while installing and running the repository, consider checking the [Issues](https:\u002F\u002Fgithub.com\u002FPacktPublishing\u002FLLM-Engineers-Handbook\u002Fissues) GitHub section for other people who solved similar problems or directly asking us for help.\n\n### 1. 
Clone the Repository\n\nStart by cloning the repository and navigating to the project directory:\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002FPacktPublishing\u002FLLM-Engineers-Handbook.git\ncd LLM-Engineers-Handbook \n```\n\nNext, we have to prepare your Python environment and its adjacent dependencies. \n\n### 2. Set Up Python Environment\n\nThe project requires Python 3.11. You can either use your global Python installation or set up a project-specific version using pyenv.\n\n#### Option A: Using Global Python (if version 3.11 is installed)\n\nVerify your Python version:\n\n```bash\npython --version  # Should show Python 3.11.x\n```\n\n#### Option B: Using pyenv (recommended)\n\n1. Verify pyenv installation:\n\n```bash\npyenv --version   # Should show pyenv 2.3.36 or later\n```\n\n2. Install Python 3.11.8:\n\n```bash\npyenv install 3.11.8\n```\n\n3. Verify the installation:\n\n```bash\npython --version  # Should show Python 3.11.8\n```\n\n4. Confirm Python version in the project directory:\n\n```bash\npython --version\n# Output: Python 3.11.8\n```\n\n> [!NOTE]  \n> The project includes a `.python-version` file that automatically sets the correct Python version when you're in the project directory.\n\n### 3. Install Dependencies\n\nThe project uses Poetry for dependency management.\n\n1. Verify Poetry installation:\n\n```bash\npoetry --version  # Should show Poetry version 1.8.3 or later\n```\n\n2. Set up the project environment and install dependencies:\n\n```bash\npoetry env use 3.11\npoetry install --without aws\npoetry run pre-commit install\n```\n\nThis will:\n\n- Configure Poetry to use Python 3.11\n- Install project dependencies (excluding AWS-specific packages)\n- Set up pre-commit hooks for code verification\n\n### 4. Activate the Environment\n\nAs our task manager, we run all the scripts using [Poe the Poet](https:\u002F\u002Fpoethepoet.natn.io\u002Findex.html).\n\n1. Start a Poetry shell:\n\n```bash\npoetry shell\n```\n\n2. 
Run project commands using Poe the Poet:\n\n```bash\npoetry poe ...\n```\n\n\u003Cdetails>\n\u003Csummary>🔧 Troubleshooting Poe the Poet Installation\u003C\u002Fsummary>\n\n### Alternative Command Execution\n\nIf you're experiencing issues with `poethepoet`, you can still run the project commands directly through Poetry. Here's how:\n\n1. Look up the command definition in `pyproject.toml`\n2. Use `poetry run` with the underlying command\n\n#### Example:\nInstead of:\n```bash\npoetry poe local-infrastructure-up\n```\nUse the direct command from pyproject.toml:\n```bash\npoetry run \u003Cactual-command-from-pyproject-toml>\n```\nNote: All project commands are defined in the [tool.poe.tasks] section of pyproject.toml\n\u003C\u002Fdetails>\n\nNow, let's configure our local project with all the necessary credentials and tokens to run the code locally.\n\n### 5. Local Development Setup\n\nAfter you have installed all the dependencies, you must create and fill a `.env` file with your credentials to appropriately interact with other services and run the project. Setting your sensitive credentials in a `.env` file is a good security practice, as this file won't be committed to GitHub or shared with anyone else. \n\n1. First, copy our example by running the following:\n\n```bash\ncp .env.example .env # The file must be at your repository's root!\n```\n\n2. Now, let's understand how to fill in all the essential variables within the `.env` file to get you started. 
The following are the mandatory settings we must complete when working locally:\n\n#### OpenAI\n\nTo authenticate to OpenAI's API, you must fill out the `OPENAI_API_KEY` env var with an authentication token.\n\n```env\nOPENAI_API_KEY=your_api_key_here\n```\n\n→ Check out this [tutorial](https:\u002F\u002Fplatform.openai.com\u002Fdocs\u002Fquickstart) to learn how to provide one from OpenAI.\n\n#### Hugging Face\n\nTo authenticate to Hugging Face, you must fill out the `HUGGINGFACE_ACCESS_TOKEN` env var with an authentication token.\n\n```env\nHUGGINGFACE_ACCESS_TOKEN=your_token_here\n```\n\n→ Check out this [tutorial](https:\u002F\u002Fhuggingface.co\u002Fdocs\u002Fhub\u002Fen\u002Fsecurity-tokens) to learn how to provide one from Hugging Face.\n\n#### Comet ML & Opik\n\nTo authenticate to Comet ML (required only during training) and Opik, you must fill out the `COMET_API_KEY` env var with your authentication token.\n\n```env\nCOMET_API_KEY=your_api_key_here\n```\n\n→ Check out this [tutorial](https:\u002F\u002Fwww.comet.com\u002Fdocs\u002Fopik\u002F?utm_source=llm_handbook&utm_medium=github&utm_campaign=opik) to learn how to get started with Opik. You can also access Opik's dashboard using 🔗[this link](https:\u002F\u002Fwww.comet.com\u002Fopik?utm_source=llm_handbook&utm_medium=github&utm_content=opik).\n\n### 6. Deployment Setup\n\nWhen deploying the project to the cloud, we must set additional settings for Mongo, Qdrant, and AWS. If you are just working locally, the default values of these env vars will work out of the box. 
Detailed deployment instructions are available in Chapter 11 of the [LLM Engineer's Handbook](https:\u002F\u002Fwww.amazon.com\u002FLLM-Engineers-Handbook-engineering-production\u002Fdp\u002F1836200072\u002F).\n\n#### MongoDB\n\nSet the `DATABASE_HOST` env var to the URL of your cloud MongoDB cluster.\n\n```env\nDATABASE_HOST=your_mongodb_url\n```\n\n→ Check out this [tutorial](https:\u002F\u002Fwww.mongodb.com\u002Fresources\u002Fproducts\u002Ffundamentals\u002Fmongodb-cluster-setup) to learn how to create and host a MongoDB cluster for free.\n\n#### Qdrant\n\nSet `USE_QDRANT_CLOUD` to `true`, `QDRANT_CLOUD_URL` to the URL of your cloud Qdrant cluster, and `QDRANT_APIKEY` to its API key.\n\n```env\nUSE_QDRANT_CLOUD=true\nQDRANT_CLOUD_URL=your_qdrant_cloud_url\nQDRANT_APIKEY=your_qdrant_api_key\n```\n\n→ Check out this [tutorial](https:\u002F\u002Fqdrant.tech\u002Fdocumentation\u002Fcloud\u002Fcreate-cluster\u002F) to learn how to create a Qdrant cluster for free.\n\n#### AWS\n\nFor your AWS set-up to work correctly, you need the AWS CLI installed on your local machine and properly configured with an admin user (or a user with enough permissions to create new SageMaker, ECR, and S3 resources; using an admin user will make everything more straightforward).\n\nChapter 2 provides step-by-step instructions on how to install the AWS CLI, create an admin user on AWS, and get an access key to set up the `AWS_ACCESS_KEY` and `AWS_SECRET_KEY` environment variables. If you already have an AWS admin user in place, you have to configure the following env vars in your `.env` file:\n\n```bash\nAWS_REGION=eu-central-1 # Replace with your AWS region.\nAWS_ACCESS_KEY=your_aws_access_key\nAWS_SECRET_KEY=your_aws_secret_key\n```\n\nAWS credentials are typically stored in `~\u002F.aws\u002Fcredentials`. 
You can view this file directly using `cat` or similar commands:\n\n```bash\ncat ~\u002F.aws\u002Fcredentials\n```\n\n> [!IMPORTANT]\n> Additional configuration options are available in [settings.py](https:\u002F\u002Fgithub.com\u002FPacktPublishing\u002FLLM-Engineers-Handbook\u002Fblob\u002Fmain\u002Fllm_engineering\u002Fsettings.py). Any variable in the `Settings` class can be configured through the `.env` file. \n\n## 🏗️ Infrastructure\n\n### Local infrastructure (for testing and development)\n\nWhen running the project locally, we host a MongoDB and Qdrant database using Docker. Also, a testing ZenML server is made available through their Python package.\n\n> [!WARNING]\n> You need Docker installed (>= v27.1.1)\n\nFor ease of use, you can start the whole local development infrastructure with the following command:\n```bash\npoetry poe local-infrastructure-up\n```\n\nAlso, you can stop the ZenML server and all the Docker containers using the following command:\n```bash\npoetry poe local-infrastructure-down\n```\n\n> [!WARNING]  \n> When running on MacOS, before starting the server, export the following environment variable:\n> `export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES`\n> Otherwise, the connection between the local server and pipeline will break. 
🔗 More details in [this issue](https:\u002F\u002Fgithub.com\u002Fzenml-io\u002Fzenml\u002Fissues\u002F2369).\n> This is done by default when using Poe the Poet.\n\nStart the inference real-time RESTful API:\n```bash\npoetry poe run-inference-ml-service\n```\n\n> [!IMPORTANT]\n> The LLM microservice, called by the RESTful API, will work only after deploying the LLM to AWS SageMaker.\n\n#### ZenML\n\nDashboard URL: `localhost:8237`\n\nDefault credentials:\n  - `username`: default\n  - `password`: \n\n→ Find out more about using and setting up [ZenML](https:\u002F\u002Fdocs.zenml.io\u002F).\n\n#### Qdrant\n\nREST API URL: `localhost:6333`\n\nDashboard URL: `localhost:6333\u002Fdashboard`\n\n→ Find out more about using and setting up [Qdrant with Docker](https:\u002F\u002Fqdrant.tech\u002Fdocumentation\u002Fquick-start\u002F).\n\n#### MongoDB\n\nDatabase URI: `mongodb:\u002F\u002Fllm_engineering:llm_engineering@127.0.0.1:27017`\n\nDatabase name: `twin`\n\nDefault credentials:\n  - `username`: llm_engineering\n  - `password`: llm_engineering\n\n→ Find out more about using and setting up [MongoDB with Docker](https:\u002F\u002Fwww.mongodb.com\u002Fdocs\u002Fmanual\u002Ftutorial\u002Finstall-mongodb-community-with-docker).\n\nYou can search your MongoDB collections using your **IDE's MongoDB plugin** (which you have to install separately). Connect it to the MongoDB database hosted within the Docker container using the database URI: `mongodb:\u002F\u002Fllm_engineering:llm_engineering@127.0.0.1:27017`\n\n> [!IMPORTANT]\n> Everything related to training or running the LLMs (e.g., training, evaluation, inference) can only be run if you set up AWS SageMaker, as explained in the next section on cloud infrastructure.\n\n### Cloud infrastructure (for production)\n\nHere we will quickly present how to deploy the project to AWS and other serverless services. 
We won't go into the details (as everything is presented in the book) but only point out the main steps you have to go through.\n\nFirst, reinstall your Python dependencies with the AWS group:\n```bash\npoetry install --with aws\n```\n\n#### AWS SageMaker\n\n> [!NOTE]\n> Chapter 10 provides step-by-step instructions in the section \"Implementing the LLM microservice using AWS SageMaker\".\n\nBy this point, we expect you to have AWS CLI installed and your AWS CLI and project's env vars (within the `.env` file) properly configured with an AWS admin user.\n\nTo ensure best practices, we must create a new AWS user restricted to creating and deleting only resources related to AWS SageMaker. Create it by running:\n```bash\npoetry poe create-sagemaker-role\n```\nIt will create a `sagemaker_user_credentials.json` file at the root of your repository with your new `AWS_ACCESS_KEY` and `AWS_SECRET_KEY` values. **But before replacing your new AWS credentials, also run the following command to create the execution role (to create it using your admin credentials).**\n\nTo create the IAM execution role used by AWS SageMaker to access other AWS resources on our behalf, run the following:\n```bash\npoetry poe create-sagemaker-execution-role\n```\nIt will create a `sagemaker_execution_role.json` file at the root of your repository with your new `AWS_ARN_ROLE` value. Add it to your `.env` file. \n\nOnce you've updated the `AWS_ACCESS_KEY`, `AWS_SECRET_KEY`, and `AWS_ARN_ROLE` values in your `.env` file, you can use AWS SageMaker. **Note that this step is crucial to complete the AWS setup.**\n\n#### Training\n\nWe start the training pipeline through ZenML by running the following:\n```bash\npoetry poe run-training-pipeline\n```\nThis will start the training code using the configs from `configs\u002Ftraining.yaml` directly in SageMaker. 
You can visualize the results in Comet ML's dashboard.\n\nWe start the evaluation pipeline through ZenML by running the following:\n```bash\npoetry poe run-evaluation-pipeline\n```\nThis will start the evaluation code using the configs from `configs\u002Fevaluating.yaml` directly in SageMaker. You can visualize the results in `*-results` datasets saved to your Hugging Face profile.\n\n#### Inference\n\nTo create an AWS SageMaker Inference Endpoint, run:\n```bash\npoetry poe deploy-inference-endpoint\n```\nTo test it out, run:\n```bash\npoetry poe test-sagemaker-endpoint\n```\nTo delete it, run:\n```bash\npoetry poe delete-inference-endpoint\n```\n\n#### AWS: ML pipelines, artifacts, and containers\n\nThe ML pipelines, artifacts, and containers are deployed to AWS by leveraging ZenML's deployment features. Thus, you must create an account with ZenML Cloud and follow their guide on deploying a ZenML stack to AWS. Otherwise, we provide step-by-step instructions in **Chapter 11**, section **Deploying the LLM Twin's pipelines to the cloud** on what you must do.  \n\n#### Qdrant & MongoDB\n\nWe leverage Qdrant's and MongoDB's serverless options when deploying the project. Thus, you can either follow [Qdrant's](https:\u002F\u002Fqdrant.tech\u002Fdocumentation\u002Fcloud\u002Fcreate-cluster\u002F) and [MongoDB's](https:\u002F\u002Fwww.mongodb.com\u002Fresources\u002Fproducts\u002Ffundamentals\u002Fmongodb-cluster-setup) tutorials on how to create a freemium cluster for each or go through **Chapter 11**, section **Deploying the LLM Twin's pipelines to the cloud** and follow our step-by-step instructions.\n\n#### GitHub Actions\n\nWe use GitHub Actions to implement our CI\u002FCD pipelines. 
To implement your own, you have to fork our repository and set the following env vars as Actions secrets in your forked repository:\n- `AWS_ACCESS_KEY_ID`\n- `AWS_SECRET_ACCESS_KEY`\n- `AWS_ECR_NAME`\n- `AWS_REGION`\n\nAlso, we provide instructions on how to set everything up in **Chapter 11**, section **Adding LLMOps to the LLM Twin**.\n\n#### Comet ML & Opik\n\nYou can visualize the results on their self-hosted dashboards if you create a Comet account and correctly set the `COMET_API_KEY` env var. As Opik is powered by Comet, you don't have to set up anything else beyond Comet:\n- [Comet ML (for experiment tracking)](https:\u002F\u002Fwww.comet.com\u002F?utm_source=llm_handbook&utm_medium=github&utm_campaign=opik)\n- [Opik (for prompt monitoring)](https:\u002F\u002Fwww.comet.com\u002Fopik?utm_source=llm_handbook&utm_medium=github&utm_campaign=opik)\n\n### 💰 Running the Project Costs\n\nWe will mostly stick to free tiers for all the services except for AWS and OpenAI's API, which are both pay-as-you-go services. The cost of running the project once, with our default values, will be roughly $25 (most of it comes from using AWS SageMaker for training and inference).\n\n## ⚡ Pipelines\n\nAll the ML pipelines will be orchestrated behind the scenes by [ZenML](https:\u002F\u002Fwww.zenml.io\u002F). A few exceptions exist when running utility scripts, such as exporting or importing from the data warehouse.\n\nThe ZenML pipelines are the entry point for most processes throughout this project. They are under the `pipelines\u002F` folder. 
Thus, when you want to understand or debug a workflow, starting with the ZenML pipeline is the best approach.\n\nTo see the pipelines running and their results:\n- go to your ZenML dashboard\n- go to the `Pipelines` section\n- click on a specific pipeline (e.g., `feature_engineering`)\n- click on a specific run (e.g., `feature_engineering_run_2024_06_20_18_40_24`)\n- click on a specific step or artifact of the DAG to find more details about it\n\nNow, let's explore all the pipelines you can run. From data collection to training, we will present them in their natural order to go through the LLM project end-to-end.\n\n### Data pipelines\n\nRun the data collection ETL:\n```bash\npoetry poe run-digital-data-etl\n```\n\n> [!WARNING]\n> You must have Chrome (or another Chromium-based browser) installed on your system for LinkedIn and Medium crawlers to work (which use Selenium under the hood). Based on your Chrome version, the Chromedriver will be automatically installed to enable Selenium support. Another option is to run everything using our Docker image if you don't want to install Chrome. For example, to run all the pipelines combined you can run `poetry poe run-docker-end-to-end-data-pipeline`. 
Note that the command can be tweaked to support any other pipeline.\n>\n> If, for any other reason, you don't have a Chromium-based browser installed and don't want to use Docker, you have two other options to bypass this Selenium issue:\n> - Comment out all the code related to Selenium, Chrome and all the links that use Selenium to crawl them (e.g., Medium), such as the `chromedriver_autoinstaller.install()` command from [application.crawlers.base](https:\u002F\u002Fgithub.com\u002FPacktPublishing\u002FLLM-Engineers-Handbook\u002Fblob\u002Fmain\u002Fllm_engineering\u002Fapplication\u002Fcrawlers\u002Fbase.py) and other static calls that check for Chrome drivers and Selenium.\n> - Install Google Chrome using your CLI in environments such as GitHub Codespaces or other cloud VMs using the same command as in our [Docker file](https:\u002F\u002Fgithub.com\u002FPacktPublishing\u002FLLM-Engineers-Handbook\u002Fblob\u002Fmain\u002FDockerfile#L10).\n\nTo add additional links to collect from, go to `configs\u002Fdigital_data_etl_[author_name].yaml` and add them to the `links` field. 
Also, you can create a completely new file and specify it at run time, like this: `python -m llm_engineering.interfaces.orchestrator.run --run-etl --etl-config-filename configs\u002Fdigital_data_etl_[your_name].yaml`\n\nRun the feature engineering pipeline:\n```bash\npoetry poe run-feature-engineering-pipeline\n```\n\nGenerate the instruct dataset:\n```bash\npoetry poe run-generate-instruct-datasets-pipeline\n```\n\nGenerate the preference dataset:\n```bash\npoetry poe run-generate-preference-datasets-pipeline\n```\n\nRun all of the above compressed into a single pipeline:\n```bash\npoetry poe run-end-to-end-data-pipeline\n```\n\n### Utility pipelines\n\nExport the data from the data warehouse to JSON files:\n```bash\npoetry poe run-export-data-warehouse-to-json\n```\n\nImport data to the data warehouse from JSON files (by default, it imports the data from the `data\u002Fdata_warehouse_raw_data` directory):\n```bash\npoetry poe run-import-data-warehouse-from-json\n```\n\nExport ZenML artifacts to JSON:\n```bash\npoetry poe run-export-artifact-to-json-pipeline\n```\n\nThis will export the following ZenML artifacts to the `output` folder as JSON files (it will take their latest version):\n- cleaned_documents.json\n- instruct_datasets.json\n- preference_datasets.json\n- raw_documents.json\n\nYou can configure what artifacts to export by tweaking the `configs\u002Fexport_artifact_to_json.yaml` configuration file.\n\n### Training pipelines\n\nRun the training pipeline:\n```bash\npoetry poe run-training-pipeline\n```\n\nRun the evaluation pipeline:\n```bash\npoetry poe run-evaluation-pipeline\n```\n\n> [!WARNING]\n> For this to work, make sure you properly configured AWS SageMaker as described in [Set up cloud infrastructure (for production)](#set-up-cloud-infrastructure-for-production).\n\n### Inference pipelines\n\nCall the RAG retrieval module with a test query:\n```bash\npoetry poe call-rag-retrieval-module\n```\n\nStart the inference real-time RESTful 
API:\n```bash\npoetry poe run-inference-ml-service\n```\n\nCall the inference real-time RESTful API with a test query:\n```bash\npoetry poe call-inference-ml-service\n```\n\nRemember that you can monitor the prompt traces on [Opik](https:\u002F\u002Fwww.comet.com\u002Fopik).\n\n> [!WARNING]\n> For the inference service to work, you must have the LLM microservice deployed to AWS SageMaker, as explained in the setup cloud infrastructure section.\n\n### Linting & formatting (QA)\n\nCheck or fix your linting issues:\n```bash\npoetry poe lint-check\npoetry poe lint-fix\n```\n\nCheck or fix your formatting issues:\n```bash\npoetry poe format-check\npoetry poe format-fix\n```\n\nCheck the code for leaked credentials:\n```bash\npoetry poe gitleaks-check\n```\n\n### Tests\n\nRun all the tests using the following command:\n```bash\npoetry poe test\n```\n\n## 🏃 Run project\n\nBased on the setup and usage steps described above, assuming the local and cloud infrastructure works and the `.env` is filled as expected, follow the next steps to run the LLM system end-to-end:\n\n### Data\n\n1. Collect data: `poetry poe run-digital-data-etl`\n\n2. Compute features: `poetry poe run-feature-engineering-pipeline`\n\n3. Compute instruct dataset: `poetry poe run-generate-instruct-datasets-pipeline`\n\n4. Compute preference alignment dataset: `poetry poe run-generate-preference-datasets-pipeline`\n\n### Training\n\n> [!IMPORTANT]\n> From now on, for these steps to work, you need to properly set up AWS SageMaker, such as running `poetry install --with aws` and filling in the AWS-related environment variables and configs.\n\n5. SFT fine-tuning Llama 3.1: `poetry poe run-training-pipeline`\n\n6. For DPO, go to `configs\u002Ftraining.yaml`, change `finetuning_type` to `dpo`, and run `poetry poe run-training-pipeline` again\n\n7. 
Evaluate fine-tuned models: `poetry poe run-evaluation-pipeline`\n\n### Inference\n\n> [!IMPORTANT]\n> From now on, for these steps to work, you need to properly set up AWS SageMaker, such as running `poetry install --with aws` and filling in the AWS-related environment variables and configs.\n\n8. Call only the RAG retrieval module: `poetry poe call-rag-retrieval-module`\n\n9. Deploy the LLM Twin microservice to SageMaker: `poetry poe deploy-inference-endpoint`\n\n10. Test the LLM Twin microservice: `poetry poe test-sagemaker-endpoint`\n\n11. Start end-to-end RAG server: `poetry poe run-inference-ml-service`\n\n12. Test RAG server: `poetry poe call-inference-ml-service`\n\n## 📄 License\n\nThis course is an open-source project released under the MIT license. Thus, as long as you distribute our LICENSE and acknowledge our work, you can safely clone or fork this project and use it as a source of inspiration for whatever you want (e.g., university projects, college degree projects, personal projects, etc.).\n","The LLM Engineer's Handbook is a project that helps developers go from the fundamentals to advanced applications, deploying large language model (LLM) and retrieval-augmented generation (RAG) applications on AWS using best practices. Its core features include data collection and generation, an LLM training pipeline, a simple RAG system, production-ready AWS deployment, and a comprehensive monitoring and testing framework. Written in Python, it uses modern development tools such as Poetry and Docker and integrates deeply with AWS services to support cloud-native deployment. The project is well suited to AI engineers or researchers who want to understand and practice building efficient, scalable language model systems.",2,"2026-06-11 03:40:27","high_star"]