[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-6251":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":25,"hasPages":23,"topics":26,"createdAt":10,"pushedAt":10,"updatedAt":34,"readmeContent":35,"aiSummary":36,"trendingCount":16,"starSnapshotCount":16,"syncStatus":17,"lastSyncTime":37,"discoverSource":38},6251,"ffmpeg-libav-tutorial","leandromoreira\u002Fffmpeg-libav-tutorial","leandromoreira","FFmpeg libav tutorial - learn how media works from basic to transmuxing, transcoding and more. Translations: 🇺🇸 🇨🇳 🇰🇷 🇪🇸 🇻🇳 🇧🇷 🇷🇺","https:\u002F\u002Fgithub.com\u002Fleandromoreira\u002Fffmpeg-libav-tutorial",null,"C",10996,1017,270,37,0,2,7,34,8,44.02,"BSD 3-Clause \"New\" or \"Revised\" License",false,"master",true,[27,28,29,30,31,32,33],"codec","ffmpeg","ffmpeg-libraries","libav","transcode-video","tutorial","video-processing","2026-06-12 02:01:17","[🇨🇳](\u002FREADME-cn.md \"Simplified Chinese\")\n[🇰🇷](\u002FREADME-ko.md \"Korean\")\n[🇪🇸](\u002FREADME-es.md \"Spanish\")\n[🇻🇳](\u002FREADME-vn.md \"Vietnamese\")\n[🇧🇷](\u002FREADME-pt.md \"Portuguese\")\n[🇷🇺](\u002FREADME-ru.md \"Russian\")\n\n[![license](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flicense-BSD--3--Clause-blue.svg)](https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Flicense-BSD--3--Clause-blue.svg)\n\nI was looking for a tutorial\u002Fbook that would teach me how to start to use [FFmpeg](https:\u002F\u002Fwww.ffmpeg.org\u002F) as a library (a.k.a. 
libav) and then I found the [\"How to write a video player in less than 1k lines\"](http:\u002F\u002Fdranger.com\u002Fffmpeg\u002F) tutorial.\nUnfortunately it was deprecated, so I decided to write this one.\n\nMost of the code in here will be in C **but don't worry**: you can easily understand and apply it to your preferred language.\nFFmpeg libav has lots of bindings for many languages like [python](https:\u002F\u002Fpyav.org\u002F), [go](https:\u002F\u002Fgithub.com\u002Fimkira\u002Fgo-libav) and even if your language doesn't have it, you can still support it through the `ffi` (here's an example with [Lua](https:\u002F\u002Fgithub.com\u002Fdaurnimator\u002Fffmpeg-lua-ffi\u002Fblob\u002Fmaster\u002Finit.lua)).\n\nWe'll start with a quick lesson about what is video, audio, codec and container and then we'll go to a crash course on how to use `FFmpeg` command line and finally we'll write code, feel free to skip directly to[ ](http:\u002F\u002Fnewmediarockstars.com\u002Fwp-content\u002Fuploads\u002F2015\u002F11\u002Fnintendo-direct-iwata.jpg)the section [Learn FFmpeg libav the Hard Way.](#learn-ffmpeg-libav-the-hard-way)\n\nSome people used to say that the Internet video streaming is the future of the traditional TV, in any case, the FFmpeg is something that is worth studying.\n\n__Table of Contents__\n\n* [Intro](#intro)\n  * [video - what you see!](#video---what-you-see)\n  * [audio - what you listen!](#audio---what-you-listen)\n  * [codec - shrinking data](#codec---shrinking-data)\n  * [container - a comfy place for audio and video](#container---a-comfy-place-for-audio-and-video)\n* [FFmpeg - command line](#ffmpeg---command-line)\n  * [FFmpeg command line tool 101](#ffmpeg-command-line-tool-101)\n* [Common video operations](#common-video-operations)\n  * [Transcoding](#transcoding)\n  * [Transmuxing](#transmuxing)\n  * [Transrating](#transrating)\n  * [Transsizing](#transsizing)\n  * [Bonus Round: Adaptive Streaming](#bonus-round-adaptive-streaming)\n  * [Going 
beyond](#going-beyond)\n* [Learn FFmpeg libav the Hard Way](#learn-ffmpeg-libav-the-hard-way)\n  * [Chapter 0 - The infamous hello world](#chapter-0---the-infamous-hello-world)\n    * [FFmpeg libav architecture](#ffmpeg-libav-architecture)\n  * [Chapter 1 - timing](#chapter-1---syncing-audio-and-video)\n  * [Chapter 2 - remuxing](#chapter-2---remuxing)\n  * [Chapter 3 - transcoding](#chapter-3---transcoding)\n\n# Intro\n\n## video - what you see!\n\nIf you have a sequence of images and change them at a given frequency (let's say [24 images per second](https:\u002F\u002Fwww.filmindependent.org\u002Fblog\u002Fhacking-film-24-frames-per-second\u002F)), you will create an [illusion of movement](https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FPersistence_of_vision).\nIn summary this is the very basic idea behind a video: **a series of pictures \u002F frames running at a given rate**.\n\n\u003Cimg src=\"https:\u002F\u002Fupload.wikimedia.org\u002Fwikipedia\u002Fcommons\u002F1\u002F1f\u002FLinnet_kineograph_1886.jpg\" title=\"flip book\" height=\"280\">\u003C\u002Fimg>\n\nContemporary illustration (1886)\n\n## audio - what you listen!\n\nAlthough a muted video can express a variety of feelings, adding sound to it brings more pleasure to the experience.\n\nSound is the vibration that propagates as a wave of pressure, through the air or any other transmission medium, such as a gas, liquid or solid.\n\n> In a digital audio system, a microphone converts sound to an analog electrical signal, then an analog-to-digital converter (ADC) - typically using [pulse-code modulation (PCM)](https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FPulse-code_modulation) - converts the analog signal into a digital signal.\n\n![audio analog to digital](https:\u002F\u002Fupload.wikimedia.org\u002Fwikipedia\u002Fcommons\u002Fthumb\u002Fc\u002Fc7\u002FCPT-Sound-ADC-DAC.svg\u002F640px-CPT-Sound-ADC-DAC.svg.png \"audio analog to 
digital\")\n>[Source](https:\u002F\u002Fcommons.wikimedia.org\u002Fwiki\u002FFile:CPT-Sound-ADC-DAC.svg)\n\n## codec - shrinking data\n\n> CODEC is an electronic circuit or software that **compresses or decompresses digital audio\u002Fvideo.** It converts raw (uncompressed) digital audio\u002Fvideo to a compressed format or vice versa.\n> https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FVideo_codec\n\nBut if we chose to pack millions of images in a single file and called it a movie, we might end up with a huge file. Let's do the math:\n\nSuppose we are creating a video with a resolution of `1080 x 1920` (height x width) and that we'll spend `3 bytes` per pixel (the smallest point on a screen) to encode the color (or [24 bit color](https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FColor_depth#True_color_.2824-bit.29), which gives us 16,777,216 different colors) and this video runs at `24 frames per second` and it is `30 minutes` long.\n\n```c\nlong long toppf = 1080LL * 1920; \u002F\u002F total_of_pixels_per_frame\nint cpp = 3;                     \u002F\u002F cost_per_pixel, in bytes\nint tis = 30 * 60;               \u002F\u002F time_in_seconds\nint fps = 24;                    \u002F\u002F frames_per_second\n\n\u002F\u002F 268,738,560,000 bytes in total\nlong long required_storage = tis * fps * toppf * cpp;\n```\n\nThis video would require approximately `250.28GB` of storage or `1.19 Gbps` of bandwidth! 
That's why we need to use a [CODEC](https:\u002F\u002Fgithub.com\u002Fleandromoreira\u002Fdigital_video_introduction#how-does-a-video-codec-work).\n\n## container - a comfy place for audio and video\n\n> A container or wrapper format is a metafile format whose specification describes how different elements of data and metadata coexist in a computer file.\n> https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FDigital_container_format\n\nA **single file that contains all the streams** (mostly the audio and video) and it also provides **synchronization and general metadata**, such as title, resolution and etc.\n\nUsually we can infer the format of a file by looking at its extension: for instance a `video.webm` is probably a video using the container [`webm`](https:\u002F\u002Fwww.webmproject.org\u002F).\n\n![container](\u002Fimg\u002Fcontainer.png)\n\n# FFmpeg - command line\n\n> A complete, cross-platform solution to record, convert and stream audio and video.\n\nTo work with multimedia we can use the AMAZING tool\u002Flibrary called [FFmpeg](https:\u002F\u002Fwww.ffmpeg.org\u002F). 
Chances are you already know\u002Fuse it directly or indirectly (do you use [Chrome?](https:\u002F\u002Fwww.chromium.org\u002Fdevelopers\u002Fdesign-documents\u002Fvideo)).\n\nIt has a command line program called `ffmpeg`, a very simple yet powerful binary.\nFor instance, you can convert from `mp4` to the container `avi` just by typing the following command:\n\n```bash\n$ ffmpeg -i input.mp4 output.avi\n```\n\nWe just did a **remuxing** here, which means converting from one container to another.\nTechnically FFmpeg may also be transcoding here (without `-c copy` it re-encodes the streams with the default codecs of the output container), but we'll talk about that later.\n\n## FFmpeg command line tool 101\n\nFFmpeg has [documentation](https:\u002F\u002Fwww.ffmpeg.org\u002Fffmpeg.html) that does a great job of explaining how it works.\n\n```bash\n# you can also look for the documentation using the command line\n\nffmpeg -h full | grep -A 10 -B 10 avoid_negative_ts\n```\n\nTo make things short, the FFmpeg command line program expects the following argument format to perform its actions: `ffmpeg {1} {2} -i {3} {4} {5}`, where:\n\n1. global options\n2. input file options\n3. input url\n4. output file options\n5. 
output url\n\nParts 2, 3, 4 and 5 can be repeated as many times as you need.\nIt's easier to understand this argument format in action:\n\n``` bash\n# WARNING: this file is around 300MB\n$ wget -O bunny_1080p_60fps.mp4 http:\u002F\u002Fdistribution.bbb3d.renderfarming.net\u002Fvideo\u002Fmp4\u002Fbbb_sunflower_1080p_60fps_normal.mp4\n\n# NOTE: the trailing comments below are annotations only; remove them before running,\n# because a comment after a line-continuation backslash breaks the command\n$ ffmpeg \\\n-y \\ # global options\n-c:a libfdk_aac \\ # input options\n-i bunny_1080p_60fps.mp4 \\ # input url\n-c:v libvpx-vp9 -c:a libvorbis \\ # output options\nbunny_1080p_60fps_vp9.webm # output url\n```\nThis command takes an input file `mp4` containing two streams (an audio encoded with `aac` CODEC and a video encoded using `h264` CODEC) and converts it to `webm`, changing its audio and video CODECs too.\n\nWe could simplify the command above but then be aware that FFmpeg will adopt or guess the default values for you.\nFor instance when you just type `ffmpeg -i input.avi output.mp4` what audio\u002Fvideo CODEC does it use to produce the `output.mp4`?\n\nWerner Robitza wrote a must read\u002Fexecute [tutorial about encoding and editing with FFmpeg](http:\u002F\u002Fslhck.info\u002Fffmpeg-encoding-course\u002F#\u002F).\n\n# Common video operations\n\nWhile working with audio\u002Fvideo we usually do a set of tasks with the media.\n\n## Transcoding\n\n![transcoding](\u002Fimg\u002Ftranscoding.png)\n\n**What?** the act of converting one of the streams (audio or video) from one CODEC to another one.\n\n**Why?** sometimes some devices (TVs, smartphones, consoles, etc.) don't support X but Y, and newer CODECs provide better compression rates.\n\n**How?** converting an `H264` (AVC) video to an `H265` (HEVC).\n```bash\n$ ffmpeg \\\n-i bunny_1080p_60fps.mp4 \\\n-c:v libx265 \\\nbunny_1080p_60fps_h265.mp4\n```\n\n## Transmuxing\n\n![transmuxing](\u002Fimg\u002Ftransmuxing.png)\n\n**What?** the act of converting from one format (container) to another one.\n\n**Why?** sometimes some devices (TVs, smartphones, consoles, etc.) don't support X but 
Y and sometimes newer containers provide modern required features.\n\n**How?** converting a `mp4` to a `ts`.\n```bash\n$ ffmpeg \\\n-i bunny_1080p_60fps.mp4 \\\n-c copy \\ # just saying to ffmpeg to skip encoding\nbunny_1080p_60fps.ts\n```\n\n## Transrating\n\n![transrating](\u002Fimg\u002Ftransrating.png)\n\n**What?** the act of changing the bit rate, or producing other renditions.\n\n**Why?** people will try to watch your video in a `2G` (edge) connection using a less powerful smartphone or in a `fiber` Internet connection on their 4K TVs therefore you should offer more than one rendition of the same video with different bit rate.\n\n**How?** producing a rendition with bit rate between 964K and 3856K.\n```bash\n$ ffmpeg \\\n-i bunny_1080p_60fps.mp4 \\\n-minrate 964K -maxrate 3856K -bufsize 2000K \\\nbunny_1080p_60fps_transrating_964_3856.mp4\n```\n\nUsually we'll be using transrating with transsizing. Werner Robitza wrote another must read\u002Fexecute [series of posts about FFmpeg rate control](http:\u002F\u002Fslhck.info\u002Fposts\u002F).\n\n## Transsizing\n\n![transsizing](\u002Fimg\u002Ftranssizing.png)\n\n**What?** the act of converting from one resolution to another one. 
As said before, transsizing is often used with transrating.\n\n**Why?** reasons are about the same as for the transrating.\n\n**How?** converting a `1080p` to a `480p` resolution.\n```bash\n$ ffmpeg \\\n-i bunny_1080p_60fps.mp4 \\\n-vf scale=-2:480 \\\nbunny_1080p_60fps_transsizing_480.mp4\n```\n\n## Bonus Round: Adaptive Streaming\n\n![adaptive streaming](\u002Fimg\u002Fadaptive-streaming.png)\n\n**What?** the act of producing many resolutions (bit rates), splitting the media into chunks and serving them via HTTP.\n\n**Why?** to provide flexible media that can be watched on a low end smartphone or on a 4K TV; it's also easy to scale and deploy, but it can add latency.\n\n**How?** creating an adaptive WebM using DASH.\n```bash\n# video streams\n$ ffmpeg -i bunny_1080p_60fps.mp4 -c:v libvpx-vp9 -s 160x90 -b:v 250k -keyint_min 150 -g 150 -an -f webm -dash 1 video_160x90_250k.webm\n\n$ ffmpeg -i bunny_1080p_60fps.mp4 -c:v libvpx-vp9 -s 320x180 -b:v 500k -keyint_min 150 -g 150 -an -f webm -dash 1 video_320x180_500k.webm\n\n$ ffmpeg -i bunny_1080p_60fps.mp4 -c:v libvpx-vp9 -s 640x360 -b:v 750k -keyint_min 150 -g 150 -an -f webm -dash 1 video_640x360_750k.webm\n\n$ ffmpeg -i bunny_1080p_60fps.mp4 -c:v libvpx-vp9 -s 640x360 -b:v 1000k -keyint_min 150 -g 150 -an -f webm -dash 1 video_640x360_1000k.webm\n\n$ ffmpeg -i bunny_1080p_60fps.mp4 -c:v libvpx-vp9 -s 1280x720 -b:v 1500k -keyint_min 150 -g 150 -an -f webm -dash 1 video_1280x720_1500k.webm\n\n# audio streams\n$ ffmpeg -i bunny_1080p_60fps.mp4 -c:a libvorbis -b:a 128k -vn -f webm -dash 1 audio_128k.webm\n\n# the DASH manifest\n$ ffmpeg \\\n -f webm_dash_manifest -i video_160x90_250k.webm \\\n -f webm_dash_manifest -i video_320x180_500k.webm \\\n -f webm_dash_manifest -i video_640x360_750k.webm \\\n -f webm_dash_manifest -i video_640x360_1000k.webm \\\n -f webm_dash_manifest -i video_1280x720_1500k.webm \\\n -f webm_dash_manifest -i audio_128k.webm \\\n -c copy -map 0 -map 1 -map 2 -map 3 -map 4 -map 5 \\\n -f 
webm_dash_manifest \\\n -adaptation_sets \"id=0,streams=0,1,2,3,4 id=1,streams=5\" \\\n manifest.mpd\n```\n\nPS: I stole this example from the [Instructions to playback Adaptive WebM using DASH](http:\u002F\u002Fwiki.webmproject.org\u002Fadaptive-streaming\u002Finstructions-to-playback-adaptive-webm-using-dash)\n\n## Going beyond\n\nThere are [many and many other usages for FFmpeg](https:\u002F\u002Fgithub.com\u002Fleandromoreira\u002Fdigital_video_introduction\u002Fblob\u002Fmaster\u002Fencoding_pratical_examples.md#split-and-merge-smoothly).\nI use it in conjunction with *iMovie* to produce\u002Fedit some videos for YouTube and you can certainly use it professionally.\n\n# Learn FFmpeg libav the Hard Way\n\n> Don't you wonder sometimes 'bout sound and vision?\n> **David Robert Jones**\n\nSince the [FFmpeg](#ffmpeg---command-line) is so useful as a command line tool to do essential tasks over the media files, how can we use it in our programs?\n\nFFmpeg is [composed by several libraries](https:\u002F\u002Fwww.ffmpeg.org\u002Fdoxygen\u002Ftrunk\u002Findex.html) that can be integrated into our own programs.\nUsually, when you install FFmpeg, it installs automatically all these libraries. 
I'll be referring to the set of these libraries as **FFmpeg libav**.\n\n> This title is a homage to Zed Shaw's series [Learn X the Hard Way](https:\u002F\u002Flearncodethehardway.org\u002F), particularly his book Learn C the Hard Way.\n\n## Chapter 0 - The infamous hello world\nThis hello world actually won't show the message `\"hello world\"` in the terminal :tongue:\nInstead we're going to **print out information about the video**, things like its format (container), duration, resolution, audio channels and, in the end, we'll **decode some frames and save them as image files**.\n\n### FFmpeg libav architecture\n\nBut before we start to code, let's learn how **FFmpeg libav architecture** works and how its components communicate with others.\n\nHere's a diagram of the process of decoding a video:\n\n![ffmpeg libav architecture - decoding process](\u002Fimg\u002Fdecoding.png)\n\nYou'll first need to load your media file into a component called [`AVFormatContext`](https:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002FstructAVFormatContext.html) (the video container is also known as format).\nIt actually doesn't fully load the whole file: it often only reads the header.\n\nOnce we loaded the minimal **header of our container**, we can access its streams (think of them as a rudimentary audio and video data).\nEach stream will be available in a component called [`AVStream`](https:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002FstructAVStream.html).\n\n> Stream is a fancy name for a continuous flow of data.\n\nSuppose our video has two streams: an audio encoded with [AAC CODEC](https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FAdvanced_Audio_Coding) and a video encoded with [H264 (AVC) CODEC](https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FH.264\u002FMPEG-4_AVC). 
From each stream we can extract **pieces (slices) of data** called packets that will be loaded into components named [`AVPacket`](https:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002FstructAVPacket.html).\n\nThe **data inside the packets are still coded** (compressed) and in order to decode the packets, we need to pass them to a specific [`AVCodec`](https:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002FstructAVCodec.html).\n\nThe `AVCodec` will decode them into [`AVFrame`](https:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002FstructAVFrame.html) and finally, this component gives us **the uncompressed frame**. Notice that the same terminology\u002Fprocess is used by both audio and video streams.\n\n### Requirements\n\nSince some people were [facing issues while compiling or running the examples](https:\u002F\u002Fgithub.com\u002Fleandromoreira\u002Fffmpeg-libav-tutorial\u002Fissues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+compiling) **we're going to use [`Docker`](https:\u002F\u002Fdocs.docker.com\u002Finstall\u002F) as our development\u002Frunner environment.** We'll also use the Big Buck Bunny video, so if you don't have it locally just run the command `make fetch_small_bunny_video`.\n\n### Chapter 0 - code walkthrough\n\n> #### TLDR; show me the [code](\u002F0_hello_world.c) and execution.\n> ```bash\n> $ make run_hello\n> ```\n\nWe'll skip some details, but don't worry: the [source code is available on GitHub](\u002F0_hello_world.c).\n\nWe're going to allocate memory for the component [`AVFormatContext`](http:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002FstructAVFormatContext.html) that will hold information about the format (container).\n\n```c\nAVFormatContext *pFormatContext = avformat_alloc_context();\n```\n\nNow we're going to open the file, read its header and fill the `AVFormatContext` with minimal information about the format (notice that usually the codecs are not opened).\nThe function used to do this is 
[`avformat_open_input`](http:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002Fgroup__lavf__decoding.html#ga31d601155e9035d5b0e7efedc894ee49). It expects an `AVFormatContext`, a `filename` and two optional arguments: the [`AVInputFormat`](https:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002FstructAVInputFormat.html) (if you pass `NULL`, FFmpeg will guess the format) and the [`AVDictionary`](https:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002FstructAVDictionary.html) (which are the options to the demuxer).\n\n```c\navformat_open_input(&pFormatContext, filename, NULL, NULL);\n```\n\nWe can print the format name and the media duration:\n\n```c\nprintf(\"Format %s, duration %lld us\", pFormatContext->iformat->long_name, pFormatContext->duration);\n```\n\nTo access the `streams`, we need to read data from the media. The function [`avformat_find_stream_info`](https:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002Fgroup__lavf__decoding.html#gad42172e27cddafb81096939783b157bb) does that.\nNow, the `pFormatContext->nb_streams` will hold the amount of streams and the `pFormatContext->streams[i]` will give us the `i` stream (an [`AVStream`](https:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002FstructAVStream.html)).\n\n```c\navformat_find_stream_info(pFormatContext,  NULL);\n```\n\nNow we'll loop through all the streams.\n\n```c\nfor (int i = 0; i \u003C pFormatContext->nb_streams; i++)\n{\n  \u002F\u002F\n}\n```\n\nFor each stream, we're going to keep the [`AVCodecParameters`](https:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002FstructAVCodecParameters.html), which describes the properties of a codec used by the stream `i`.\n\n```c\nAVCodecParameters *pLocalCodecParameters = pFormatContext->streams[i]->codecpar;\n```\n\nWith the codec properties we can look up the proper CODEC querying the function [`avcodec_find_decoder`](https:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002Fgroup__lavc__decoding.html#ga19a0ca553277f019dd5b0fec6e1f9dca) and find 
the registered decoder for the codec id and return an [`AVCodec`](http:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002FstructAVCodec.html), the component that knows how to en**CO**de and **DEC**ode the stream.\n```c\nAVCodec *pLocalCodec = avcodec_find_decoder(pLocalCodecParameters->codec_id);\n```\n\nNow we can print information about the codecs.\n\n```c\n\u002F\u002F specific for video and audio\nif (pLocalCodecParameters->codec_type == AVMEDIA_TYPE_VIDEO) {\n  printf(\"Video Codec: resolution %d x %d\", pLocalCodecParameters->width, pLocalCodecParameters->height);\n} else if (pLocalCodecParameters->codec_type == AVMEDIA_TYPE_AUDIO) {\n  printf(\"Audio Codec: %d channels, sample rate %d\", pLocalCodecParameters->channels, pLocalCodecParameters->sample_rate);\n}\n\u002F\u002F general\nprintf(\"\\tCodec %s ID %d bit_rate %lld\", pLocalCodec->long_name, pLocalCodec->id, pLocalCodecParameters->bit_rate);\n```\n\nWith the codec, we can allocate memory for the [`AVCodecContext`](https:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002FstructAVCodecContext.html), which will hold the context for our decode\u002Fencode process, but then we need to fill this codec context with CODEC parameters; we do that with [`avcodec_parameters_to_context`](https:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002Fgroup__lavc__core.html#gac7b282f51540ca7a99416a3ba6ee0d16).\n\nOnce we filled the codec context, we need to open the codec. 
We call the function [`avcodec_open2`](https:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002Fgroup__lavc__core.html#ga11f785a188d7d9df71621001465b0f1d) and then we can use it.\n\n```c\nAVCodecContext *pCodecContext = avcodec_alloc_context3(pCodec);\navcodec_parameters_to_context(pCodecContext, pCodecParameters);\navcodec_open2(pCodecContext, pCodec, NULL);\n```\n\nNow we're going to read the packets from the stream and decode them into frames but first, we need to allocate memory for both components, the [`AVPacket`](https:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002FstructAVPacket.html) and [`AVFrame`](https:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002FstructAVFrame.html).\n\n```c\nAVPacket *pPacket = av_packet_alloc();\nAVFrame *pFrame = av_frame_alloc();\n```\n\nLet's feed our packets from the streams with the function [`av_read_frame`](https:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002Fgroup__lavf__decoding.html#ga4fdb3084415a82e3810de6ee60e46a61) while it has packets.\n\n```c\nwhile (av_read_frame(pFormatContext, pPacket) >= 0) {\n  \u002F\u002F...\n}\n```\n\nLet's **send the raw data packet** (compressed frame) to the decoder, through the codec context, using the function [`avcodec_send_packet`](https:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002Fgroup__lavc__decoding.html#ga58bc4bf1e0ac59e27362597e467efff3).\n\n```c\navcodec_send_packet(pCodecContext, pPacket);\n```\n\nAnd let's **receive the raw data frame** (uncompressed frame) from the decoder, through the same codec context, using the function [`avcodec_receive_frame`](https:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002Fgroup__lavc__decoding.html#ga11e6542c4e66d3028668788a1a74217c).\n\n```c\navcodec_receive_frame(pCodecContext, pFrame);\n```\n\nWe can print the frame number, the [PTS](https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FPresentation_timestamp), DTS, [frame type](https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FVideo_compression_picture_types) and 
etc.\n\n```c\n\u002F\u002F pts and pkt_dts are int64_t, so they are printed as long long\nprintf(\n    \"Frame %c (%d) pts %lld dts %lld key_frame %d [coded_picture_number %d, display_picture_number %d]\",\n    av_get_picture_type_char(pFrame->pict_type),\n    pCodecContext->frame_number,\n    (long long)pFrame->pts,\n    (long long)pFrame->pkt_dts,\n    pFrame->key_frame,\n    pFrame->coded_picture_number,\n    pFrame->display_picture_number\n);\n```\n\nFinally we can save our decoded frame into a [simple gray image](https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FNetpbm_format#PGM_example). The process is very simple: we'll use `pFrame->data`, whose index corresponds to the [planes Y, Cb and Cr](https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FYCbCr); we just pick `0` (Y) to save our gray image.\n\n```c\nsave_gray_frame(pFrame->data[0], pFrame->linesize[0], pFrame->width, pFrame->height, frame_filename);\n\nstatic void save_gray_frame(unsigned char *buf, int wrap, int xsize, int ysize, char *filename)\n{\n    FILE *f;\n    int i;\n    \u002F\u002F binary mode: P5 is the binary variant of the pgm format\n    f = fopen(filename,\"wb\");\n    \u002F\u002F writing the minimal required header for a pgm file format\n    \u002F\u002F portable graymap format -> https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FNetpbm_format#PGM_example\n    fprintf(f, \"P5\\n%d %d\\n%d\\n\", xsize, ysize, 255);\n\n    \u002F\u002F writing line by line; wrap is the line stride, which can be larger than xsize\n    for (i = 0; i \u003C ysize; i++)\n        fwrite(buf + i * wrap, 1, xsize, f);\n    fclose(f);\n}\n```\n\nAnd voilà! 
Now we have a 2MB grayscale image:\n\n![saved frame](\u002Fimg\u002Fgenerated_frame.png)\n\n## Chapter 1 - syncing audio and video\n\n> **Be the player** - a young JS developer writing a new MSE video player.\n\nBefore we move to [code a transcoding example](#chapter-3---transcoding) let's talk about **timing**, or how a video player knows the right time to play a frame.\n\nIn the last example, we saved some frames that can be seen here:\n\n![frame 0](\u002Fimg\u002Fhello_world_frames\u002Fframe0.png)\n![frame 1](\u002Fimg\u002Fhello_world_frames\u002Fframe1.png)\n![frame 2](\u002Fimg\u002Fhello_world_frames\u002Fframe2.png)\n![frame 3](\u002Fimg\u002Fhello_world_frames\u002Fframe3.png)\n![frame 4](\u002Fimg\u002Fhello_world_frames\u002Fframe4.png)\n![frame 5](\u002Fimg\u002Fhello_world_frames\u002Fframe5.png)\n\nWhen we're designing a video player we need to **play each frame at a given pace**, otherwise the video would be unpleasant to watch because it plays either too fast or too slow.\n\nTherefore we need to introduce some logic to play each frame smoothly. 
For that matter, each frame has a **presentation timestamp** (PTS) which is an increasing number factored in a **timebase** that is a rational number (where the denominator is known as **timescale**) divisible by the **frame rate (fps)**.\n\nIt's easier to understand when we look at some examples, let's simulate some scenarios.\n\nFor a `fps=60\u002F1` and `timebase=1\u002F60000` each PTS will increase `timescale \u002F fps = 1000` therefore the **PTS real time** for each frame could be (supposing it started at 0):\n\n* `frame=0, PTS = 0, PTS_TIME = 0`\n* `frame=1, PTS = 1000, PTS_TIME = PTS * timebase = 0.016`\n* `frame=2, PTS = 2000, PTS_TIME = PTS * timebase = 0.033`\n\nFor almost the same scenario but with a timebase equal to `1\u002F60`.\n\n* `frame=0, PTS = 0, PTS_TIME = 0`\n* `frame=1, PTS = 1, PTS_TIME = PTS * timebase = 0.016`\n* `frame=2, PTS = 2, PTS_TIME = PTS * timebase = 0.033`\n* `frame=3, PTS = 3, PTS_TIME = PTS * timebase = 0.050`\n\nFor a `fps=25\u002F1` and `timebase=1\u002F75` each PTS will increase `timescale \u002F fps = 3` and the PTS time could be:\n\n* `frame=0, PTS = 0, PTS_TIME = 0`\n* `frame=1, PTS = 3, PTS_TIME = PTS * timebase = 0.04`\n* `frame=2, PTS = 6, PTS_TIME = PTS * timebase = 0.08`\n* `frame=3, PTS = 9, PTS_TIME = PTS * timebase = 0.12`\n* ...\n* `frame=24, PTS = 72, PTS_TIME = PTS * timebase = 0.96`\n* ...\n* `frame=4064, PTS = 12192, PTS_TIME = PTS * timebase = 162.56`\n\nNow with the `pts_time` we can find a way to render this synched with audio `pts_time` or with a system clock. 
FFmpeg libav provides this information through its API:\n\n- fps = [`AVStream->avg_frame_rate`](https:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002FstructAVStream.html#a946e1e9b89eeeae4cab8a833b482c1ad)\n- tbr = [`AVStream->r_frame_rate`](https:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002FstructAVStream.html#ad63fb11cc1415e278e09ddc676e8a1ad)\n- tbn = [`AVStream->time_base`](https:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002FstructAVStream.html#a9db755451f14e2bf590d4b85d82b32e6)\n\nJust out of curiosity, the frames we saved were sent in DTS order (frames: 1,6,4,2,3,5) but played in PTS order (frames: 1,2,3,4,5,6). Also, notice how cheap B-Frames are in comparison to P or I-Frames.\n\n```\nLOG: AVStream->r_frame_rate 60\u002F1\nLOG: AVStream->time_base 1\u002F60000\n...\nLOG: Frame 1 (type=I, size=153797 bytes) pts 6000 key_frame 1 [DTS 0]\nLOG: Frame 2 (type=B, size=8117 bytes) pts 7000 key_frame 0 [DTS 3]\nLOG: Frame 3 (type=B, size=8226 bytes) pts 8000 key_frame 0 [DTS 4]\nLOG: Frame 4 (type=B, size=17699 bytes) pts 9000 key_frame 0 [DTS 2]\nLOG: Frame 5 (type=B, size=6253 bytes) pts 10000 key_frame 0 [DTS 5]\nLOG: Frame 6 (type=P, size=34992 bytes) pts 11000 key_frame 0 [DTS 1]\n```\n\n## Chapter 2 - remuxing\n\nRemuxing is the act of changing from one format (container) to another. For instance, we can change an [MPEG-4](https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FMPEG-4_Part_14) video to an [MPEG-TS](https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FMPEG_transport_stream) one without much pain using FFmpeg:\n\n```bash\nffmpeg -i input.mp4 -c copy output.ts\n```\n\nIt'll demux the mp4 but it won't decode or encode it (`-c copy`) and in the end, it'll mux it into an `mpegts` file. 
If you don't provide the format with `-f`, ffmpeg will try to guess it based on the file's extension.\n\nThe general usage of FFmpeg or the libav follows a pattern\u002Farchitecture or workflow:\n* **[protocol layer](https:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002Fprotocols_8c.html)** - it accepts an `input` (a `file` for instance but it could be an `rtmp` or `HTTP` input as well)\n* **[format layer](https:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002Fgroup__libavf.html)** - it `demuxes` its content, revealing mostly metadata and its streams\n* **[codec layer](https:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002Fgroup__libavc.html)** - it `decodes` the compressed stream data \u003Csup>*optional*\u003C\u002Fsup>\n* **[pixel layer](https:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002Fgroup__lavfi.html)** - it can also apply some `filters` to the raw frames (like resizing)\u003Csup>*optional*\u003C\u002Fsup>\n* and then it does the reverse path\n* **[codec layer](https:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002Fgroup__libavc.html)** - it `encodes` (or `re-encodes` or even `transcodes`) the raw frames\u003Csup>*optional*\u003C\u002Fsup>\n* **[format layer](https:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002Fgroup__libavf.html)** - it `muxes` (or `remuxes`) the raw streams (the compressed data)\n* **[protocol layer](https:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002Fprotocols_8c.html)** - and finally the muxed data is sent to an `output` (another file or maybe a remote network server)\n\n![ffmpeg libav workflow](\u002Fimg\u002Fffmpeg_libav_workflow.jpeg)\n> This graph is strongly inspired by [Leixiaohua's](http:\u002F\u002Fleixiaohua1020.github.io\u002F#ffmpeg-development-examples) and [Slhck's](https:\u002F\u002Fslhck.info\u002Fffmpeg-encoding-course\u002F#\u002F9) works.\n\nNow let's code an example using libav to provide the same effect as in `ffmpeg -i input.mp4 -c copy output.ts`.\n\nWe're going to read from an input 
(`input_format_context`) and write it to an output (`output_format_context`).\n\n```c\nAVFormatContext *input_format_context = NULL;\nAVFormatContext *output_format_context = NULL;\n```\n\nWe start with the usual steps: allocating memory and opening the input format. For this specific case, we're going to open an input file and allocate memory for an output file.\n\n```c\nif ((ret = avformat_open_input(&input_format_context, in_filename, NULL, NULL)) \u003C 0) {\n  fprintf(stderr, \"Could not open input file '%s'\", in_filename);\n  goto end;\n}\nif ((ret = avformat_find_stream_info(input_format_context, NULL)) \u003C 0) {\n  fprintf(stderr, \"Failed to retrieve input stream information\");\n  goto end;\n}\n\navformat_alloc_output_context2(&output_format_context, NULL, NULL, out_filename);\nif (!output_format_context) {\n  fprintf(stderr, \"Could not create output context\\n\");\n  ret = AVERROR_UNKNOWN;\n  goto end;\n}\n```\n\nWe're going to remux only the video, audio and subtitle streams, so we keep track of the streams we'll be using in an array of indexes.\n\n```c\nnumber_of_streams = input_format_context->nb_streams;\n\u002F\u002F av_calloc zero-initializes the array; it replaces av_mallocz_array,\n\u002F\u002F which is deprecated in recent FFmpeg versions\nstreams_list = av_calloc(number_of_streams, sizeof(*streams_list));\n```\n\nJust after we've allocated the required memory, we loop through all the streams and, for each one, create a new output stream in our output format context using the [avformat_new_stream](https:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002Fgroup__lavf__core.html#gadcb0fd3e507d9b58fe78f61f8ad39827) function. 
Notice that we're marking all the streams that aren't video, audio or subtitle so we can skip them after.\n\n```c\nfor (i = 0; i \u003C input_format_context->nb_streams; i++) {\n  AVStream *out_stream;\n  AVStream *in_stream = input_format_context->streams[i];\n  AVCodecParameters *in_codecpar = in_stream->codecpar;\n  if (in_codecpar->codec_type != AVMEDIA_TYPE_AUDIO &&\n      in_codecpar->codec_type != AVMEDIA_TYPE_VIDEO &&\n      in_codecpar->codec_type != AVMEDIA_TYPE_SUBTITLE) {\n    streams_list[i] = -1;\n    continue;\n  }\n  streams_list[i] = stream_index++;\n  out_stream = avformat_new_stream(output_format_context, NULL);\n  if (!out_stream) {\n    fprintf(stderr, \"Failed allocating output stream\\n\");\n    ret = AVERROR_UNKNOWN;\n    goto end;\n  }\n  ret = avcodec_parameters_copy(out_stream->codecpar, in_codecpar);\n  if (ret \u003C 0) {\n    fprintf(stderr, \"Failed to copy codec parameters\\n\");\n    goto end;\n  }\n}\n```\n\nNow we can create the output file.\n\n```c\nif (!(output_format_context->oformat->flags & AVFMT_NOFILE)) {\n  ret = avio_open(&output_format_context->pb, out_filename, AVIO_FLAG_WRITE);\n  if (ret \u003C 0) {\n    fprintf(stderr, \"Could not open output file '%s'\", out_filename);\n    goto end;\n  }\n}\n\nret = avformat_write_header(output_format_context, NULL);\nif (ret \u003C 0) {\n  fprintf(stderr, \"Error occurred when opening output file\\n\");\n  goto end;\n}\n```\n\nAfter that, we can copy the streams, packet by packet, from our input to our output streams. 
We'll loop while it has packets (`av_read_frame`), for each packet we need to re-calculate the PTS and DTS to finally write it (`av_interleaved_write_frame`) to our output format context.\n\n```c\nwhile (1) {\n  AVStream *in_stream, *out_stream;\n  ret = av_read_frame(input_format_context, &packet);\n  if (ret \u003C 0)\n    break;\n  in_stream  = input_format_context->streams[packet.stream_index];\n  if (packet.stream_index >= number_of_streams || streams_list[packet.stream_index] \u003C 0) {\n    av_packet_unref(&packet);\n    continue;\n  }\n  packet.stream_index = streams_list[packet.stream_index];\n  out_stream = output_format_context->streams[packet.stream_index];\n  \u002F* copy packet *\u002F\n  packet.pts = av_rescale_q_rnd(packet.pts, in_stream->time_base, out_stream->time_base, AV_ROUND_NEAR_INF|AV_ROUND_PASS_MINMAX);\n  packet.dts = av_rescale_q_rnd(packet.dts, in_stream->time_base, out_stream->time_base, AV_ROUND_NEAR_INF|AV_ROUND_PASS_MINMAX);\n  packet.duration = av_rescale_q(packet.duration, in_stream->time_base, out_stream->time_base);\n  \u002F\u002F https:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002FstructAVPacket.html#ab5793d8195cf4789dfb3913b7a693903\n  packet.pos = -1;\n\n  \u002F\u002Fhttps:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002Fgroup__lavf__encoding.html#ga37352ed2c63493c38219d935e71db6c1\n  ret = av_interleaved_write_frame(output_format_context, &packet);\n  if (ret \u003C 0) {\n    fprintf(stderr, \"Error muxing packet\\n\");\n    break;\n  }\n  av_packet_unref(&packet);\n}\n```\n\nTo finalize we need to write the stream trailer to an output media file with [av_write_trailer](https:\u002F\u002Fffmpeg.org\u002Fdoxygen\u002Ftrunk\u002Fgroup__lavf__encoding.html#ga7f14007e7dc8f481f054b21614dfec13) function.\n\n```c\nav_write_trailer(output_format_context);\n```\n\nNow we're ready to test it and the first test will be a format (video container) conversion from a MP4 to a MPEG-TS video file. 
We're basically reproducing the command line `ffmpeg -i input.mp4 -c copy output.ts` with libav.\n\n```bash\nmake run_remuxing_ts\n```\n\nIt's working! Don't you trust me? You shouldn't; we can check it with `ffprobe`:\n\n```bash\nffprobe -i remuxed_small_bunny_1080p_60fps.ts\n\nInput #0, mpegts, from 'remuxed_small_bunny_1080p_60fps.ts':\n  Duration: 00:00:10.03, start: 0.000000, bitrate: 2751 kb\u002Fs\n  Program 1\n    Metadata:\n      service_name    : Service01\n      service_provider: FFmpeg\n    Stream #0:0[0x100]: Video: h264 (High) ([27][0][0][0] \u002F 0x001B), yuv420p(progressive), 1920x1080 [SAR 1:1 DAR 16:9], 60 fps, 60 tbr, 90k tbn, 120 tbc\n    Stream #0:1[0x101]: Audio: ac3 ([129][0][0][0] \u002F 0x0081), 48000 Hz, 5.1(side), fltp, 320 kb\u002Fs\n```\n\nTo sum up what we did here in a graph, we can revisit our initial [idea about how libav works](https:\u002F\u002Fgithub.com\u002Fleandromoreira\u002Fffmpeg-libav-tutorial#ffmpeg-libav-architecture), this time showing that we skipped the codec part.\n\n![remuxing libav components](\u002Fimg\u002Fremuxing_libav_components.png)\n\nBefore we end this chapter I'd like to show an important part of the remuxing process: **you can pass options to the muxer**. 
Let's say we want to deliver the [MPEG-DASH](https:\u002F\u002Fdeveloper.mozilla.org\u002Fen-US\u002Fdocs\u002FWeb\u002FApps\u002FFundamentals\u002FAudio_and_video_delivery\u002FSetting_up_adaptive_streaming_media_sources#MPEG-DASH_Encoding) format; for that we need to use [fragmented mp4](https:\u002F\u002Fstackoverflow.com\u002Fa\u002F35180327) (sometimes referred to as `fmp4`) instead of MPEG-TS or plain MPEG-4.\n\nWith the [command line we can do that easily](https:\u002F\u002Fdeveloper.mozilla.org\u002Fen-US\u002Fdocs\u002FWeb\u002FAPI\u002FMedia_Source_Extensions_API\u002FTranscoding_assets_for_MSE#Fragmenting).\n\n```\nffmpeg -i non_fragmented.mp4 -movflags frag_keyframe+empty_moov+default_base_moof fragmented.mp4\n```\n\nThe libav version is almost as easy as the command line: we just need to pass the options when writing the output header, just before copying the packets.\n\n```c\nAVDictionary* opts = NULL;\nav_dict_set(&opts, \"movflags\", \"frag_keyframe+empty_moov+default_base_moof\", 0);\nret = avformat_write_header(output_format_context, &opts);\n```\n\nWe now can generate this fragmented mp4 file:\n\n```bash\nmake run_remuxing_fragmented_mp4\n```\n\nBut to make sure that I'm not lying to you, you can use the amazing site\u002Ftool [gpac\u002Fmp4box.js](http:\u002F\u002Fdownload.tsi.telecom-paristech.fr\u002Fgpac\u002Fmp4box.js\u002Ffilereader.html) or the site [http:\u002F\u002Fmp4parser.com\u002F](http:\u002F\u002Fmp4parser.com\u002F) to see the differences. First load up the \"common\" mp4.\n\n![mp4 boxes](\u002Fimg\u002Fboxes_normal_mp4.png)\n\nAs you can see it has a single `mdat` atom\u002Fbox; **this is the place where the video and audio frames are**. 
Now load the fragmented mp4 to see how it spreads the `mdat` boxes.\n\n![fragmented mp4 boxes](\u002Fimg\u002Fboxes_fragmente_mp4.png)\n\n## Chapter 3 - transcoding\n\n> #### TLDR; show me the [code](\u002F3_transcoding.c) and execution.\n> ```bash\n> $ make run_transcoding\n> ```\n> We'll skip some details, but don't worry: the [source code is available on GitHub](\u002F3_transcoding.c).\n\n\n\nIn this chapter, we're going to create a minimalist transcoder, written in C, that can convert videos coded in H264 to H265 using the **FFmpeg\u002Flibav** library, specifically [libavcodec](https:\u002F\u002Fffmpeg.org\u002Flibavcodec.html), libavformat, and libavutil.\n\n![media transcoding flow](\u002Fimg\u002Ftranscoding_flow.png)\n\n> _Just a quick recap:_ The [**AVFormatContext**](https:\u002F\u002Fwww.ffmpeg.org\u002Fdoxygen\u002Ftrunk\u002FstructAVFormatContext.html) is the abstraction for the format of the media file, aka container (ex: MKV, MP4, WebM, TS). The [**AVStream**](https:\u002F\u002Fwww.ffmpeg.org\u002Fdoxygen\u002Ftrunk\u002FstructAVStream.html) represents each type of data for a given format (ex: audio, video, subtitle, metadata). 
The [**AVPacket**](https:\u002F\u002Fwww.ffmpeg.org\u002Fdoxygen\u002Ftrunk\u002FstructAVPacket.html) is a slice of compressed data obtained from the `AVStream` that can be decoded by an [**AVCodec**](https:\u002F\u002Fwww.ffmpeg.org\u002Fdoxygen\u002Ftrunk\u002FstructAVCodec.html) (ex: av1, h264, vp9, hevc), generating raw data called [**AVFrame**](https:\u002F\u002Fwww.ffmpeg.org\u002Fdoxygen\u002Ftrunk\u002FstructAVFrame.html).\n\n### Transmuxing\n\nLet's start with the simple transmuxing operation and then we can build upon this code. The first step is to **load the input file**.\n\n```c\n\u002F\u002F Allocate an AVFormatContext\navfc = avformat_alloc_context();\n\u002F\u002F Open an input stream and read the header (note the pointer to the context pointer).\navformat_open_input(&avfc, in_filename, NULL, NULL);\n\u002F\u002F Read packets of a media file to get stream information.\navformat_find_stream_info(avfc, NULL);\n```\n\nNow we're going to set up the decoder. The `AVFormatContext` gives us access to all the `AVStream` components and, for each one of them, we can get its `AVCodec` and create the corresponding `AVCodecContext`. Finally, we open the given codec so we can proceed to the decoding process.\n\n>  The [**AVCodecContext**](https:\u002F\u002Fwww.ffmpeg.org\u002Fdoxygen\u002Ftrunk\u002FstructAVCodecContext.html) holds data about media configuration such as bit rate, frame rate, sample rate, channels, height, and many others.\n\n```c\nfor (int i = 0; i \u003C avfc->nb_streams; i++)\n{\n  AVStream *avs = avfc->streams[i];\n  \u002F\u002F find the decoder that matches the stream's codec id\n  const AVCodec *avc = avcodec_find_decoder(avs->codecpar->codec_id);\n  AVCodecContext *avcc = avcodec_alloc_context3(avc);\n  \u002F\u002F fill the codec context with the stream's parameters, then open the codec\n  avcodec_parameters_to_context(avcc, avs->codecpar);\n  avcodec_open2(avcc, avc, NULL);\n}\n```\n\nWe need to prepare the output media file for transmuxing as well. We first **allocate memory** for the output `AVFormatContext`. We create **each stream** in the output format. 
In order to pack the stream properly, we **copy the codec parameters** from the decoder.\n\nWe **set the flag** `AV_CODEC_FLAG_GLOBAL_HEADER`, which tells the encoder that it can use the global headers, and finally we open the output **file for writing** and persist the headers.\n\n```c\navformat_alloc_output_context2(&encoder_avfc, NULL, NULL, out_filename);\n\nAVStream *avs = avformat_new_stream(encoder_avfc, NULL);\navcodec_parameters_copy(avs->codecpar, decoder_avs->codecpar);\n\nif (encoder_avfc->oformat->flags & AVFMT_GLOBALHEADER)\n  encoder_avfc->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;\n\navio_open(&encoder_avfc->pb, out_filename, AVIO_FLAG_WRITE);\navformat_write_header(encoder_avfc, &muxer_opts);\n\n```\n\nWe read the `AVPacket`s from the input, adjust the timestamps, and write each packet to the output file. Even though the function `av_interleaved_write_frame` says \"write frame\", we are storing the packet. We finish the transmuxing process by writing the stream trailer to the file.\n\n```c\nAVFrame *input_frame = av_frame_alloc();\nAVPacket *input_packet = av_packet_alloc();\n\nwhile (av_read_frame(decoder_avfc, input_packet) >= 0)\n{\n  av_packet_rescale_ts(input_packet, decoder_video_avs->time_base, encoder_video_avs->time_base);\n  av_interleaved_write_frame(encoder_avfc, input_packet);\n}\n\nav_write_trailer(encoder_avfc);\n```\n\n### Transcoding\n\nThe previous section showed a simple transmuxer program. Now we're going to add the capability to encode files; specifically, we're going to enable it to transcode videos from `h264` to `h265`.\n\nAfter we've prepared the decoder, but before we arrange the output media file, we're going to set up the encoder.\n\n* Create the video `AVStream` in the encoder, [`avformat_new_stream`](https:\u002F\u002Fwww.ffmpeg.org\u002Fdoxygen\u002Ftrunk\u002Fgroup__lavf__core.html#gadcb0fd3e507d9b58fe78f61f8ad39827)\n* Use the `AVCodec` called `libx265`, 
[`avcodec_find_encoder_by_name`](https:\u002F\u002Fwww.ffmpeg.org\u002Fdoxygen\u002Ftrunk\u002Fgroup__lavc__encoding.html#gaa614ffc38511c104bdff4a3afa086d37)\n* Create the `AVCodecContext` based on the created codec, [`avcodec_alloc_context3`](https:\u002F\u002Fwww.ffmpeg.org\u002Fdoxygen\u002Ftrunk\u002Fgroup__lavc__core.html#gae80afec6f26df6607eaacf39b561c315)\n* Set up basic attributes for the transcoding session, and\n* Open the codec and copy parameters from the context to the stream, [`avcodec_open2`](https:\u002F\u002Fwww.ffmpeg.org\u002Fdoxygen\u002Ftrunk\u002Fgroup__lavc__core.html#ga11f785a188d7d9df71621001465b0f1d) and [`avcodec_parameters_from_context`](https:\u002F\u002Fwww.ffmpeg.org\u002Fdoxygen\u002Ftrunk\u002Fgroup__lavc__core.html#ga0c7058f764778615e7978a1821ab3cfe)\n\n```c\nAVRational input_framerate = av_guess_frame_rate(decoder_avfc, decoder_video_avs, NULL);\nAVStream *video_avs = avformat_new_stream(encoder_avfc, NULL);\n\nchar *codec_name = \"libx265\";\nchar *codec_priv_key = \"x265-params\";\n\u002F\u002F we're going to use internal options for the x265:\n\u002F\u002F they disable the scene change detection and fix the\n\u002F\u002F GOP at 60 frames.\nchar *codec_priv_value = \"keyint=60:min-keyint=60:scenecut=0\";\n\nconst AVCodec *video_avc = avcodec_find_encoder_by_name(codec_name);\nAVCodecContext *video_avcc = avcodec_alloc_context3(video_avc);\n\u002F\u002F encoder codec params\nav_opt_set(video_avcc->priv_data, codec_priv_key, codec_priv_value, 0);\nvideo_avcc->height = decoder_ctx->height;\nvideo_avcc->width = decoder_ctx->width;\nvideo_avcc->pix_fmt = video_avc->pix_fmts[0];\n\u002F\u002F control rate (the minimum rate must not exceed the maximum rate)\nvideo_avcc->bit_rate = 2 * 1000 * 1000;\nvideo_avcc->rc_buffer_size = 4 * 1000 * 1000;\nvideo_avcc->rc_max_rate = 2.5 * 1000 * 1000;\nvideo_avcc->rc_min_rate = 2 * 1000 * 1000;\n\u002F\u002F time base\nvideo_avcc->time_base = av_inv_q(input_framerate);\nvideo_avs->time_base = video_avcc->time_base;\n\navcodec_open2(video_avcc, 
video_avc, NULL);\navcodec_parameters_from_context(video_avs->codecpar, video_avcc);\n```\n\nWe need to expand our decoding loop for the video stream transcoding:\n\n* Send the compressed `AVPacket` to the decoder, [`avcodec_send_packet`](https:\u002F\u002Fwww.ffmpeg.org\u002Fdoxygen\u002Ftrunk\u002Fgroup__lavc__decoding.html#ga58bc4bf1e0ac59e27362597e467efff3)\n* Receive the uncompressed `AVFrame`, [`avcodec_receive_frame`](https:\u002F\u002Fwww.ffmpeg.org\u002Fdoxygen\u002Ftrunk\u002Fgroup__lavc__decoding.html#ga11e6542c4e66d3028668788a1a74217c)\n* Start to transcode this raw frame,\n* Send the raw frame to the encoder, [`avcodec_send_frame`](https:\u002F\u002Fwww.ffmpeg.org\u002Fdoxygen\u002Ftrunk\u002Fgroup__lavc__decoding.html#ga9395cb802a5febf1f00df31497779169)\n* Receive the `AVPacket` compressed with our codec, [`avcodec_receive_packet`](https:\u002F\u002Fwww.ffmpeg.org\u002Fdoxygen\u002Ftrunk\u002Fgroup__lavc__decoding.html#ga5b8eff59cf259747cf0b31563e38ded6)\n* Set up the timestamp, [`av_packet_rescale_ts`](https:\u002F\u002Fwww.ffmpeg.org\u002Fdoxygen\u002Ftrunk\u002Fgroup__lavc__packet.html#gae5c86e4d93f6e7aa62ef2c60763ea67e), and\n* Write it to the output file. 
[`av_interleaved_write_frame`](https:\u002F\u002Fwww.ffmpeg.org\u002Fdoxygen\u002Ftrunk\u002Fgroup__lavf__encoding.html#ga37352ed2c63493c38219d935e71db6c1)\n\n```c\nAVFrame *input_frame = av_frame_alloc();\nAVPacket *input_packet = av_packet_alloc();\n\nwhile (av_read_frame(decoder_avfc, input_packet) >= 0)\n{\n  int response = avcodec_send_packet(decoder_video_avcc, input_packet);\n  while (response >= 0) {\n    response = avcodec_receive_frame(decoder_video_avcc, input_frame);\n    if (response == AVERROR(EAGAIN) || response == AVERROR_EOF) {\n      break;\n    } else if (response \u003C 0) {\n      return response;\n    }\n    if (response >= 0) {\n      \u002F\u002F encoder_video_avcc stands for the encoder's AVCodecContext we set up earlier\n      encode(encoder_avfc, decoder_video_avs, encoder_video_avs, encoder_video_avcc, input_frame, input_packet->stream_index);\n    }\n    av_frame_unref(input_frame);\n  }\n  av_packet_unref(input_packet);\n}\nav_write_trailer(encoder_avfc);\n\n\u002F\u002F used function: encodes one raw frame and writes the resulting packets\nint encode(AVFormatContext *avfc, AVStream *dec_video_avs, AVStream *enc_video_avs, AVCodecContext *video_avcc, AVFrame *input_frame, int index) {\n  AVPacket *output_packet = av_packet_alloc();\n  int response = avcodec_send_frame(video_avcc, input_frame);\n\n  while (response >= 0) {\n    response = avcodec_receive_packet(video_avcc, output_packet);\n    if (response == AVERROR(EAGAIN) || response == AVERROR_EOF) {\n      break;\n    } else if (response \u003C 0) {\n      return -1;\n    }\n\n    output_packet->stream_index = index;\n    output_packet->duration = enc_video_avs->time_base.den \u002F enc_video_avs->time_base.num \u002F dec_video_avs->avg_frame_rate.num * dec_video_avs->avg_frame_rate.den;\n\n    av_packet_rescale_ts(output_packet, dec_video_avs->time_base, enc_video_avs->time_base);\n    response = av_interleaved_write_frame(avfc, output_packet);\n  }\n  av_packet_unref(output_packet);\n  av_packet_free(&output_packet);\n  return 0;\n}\n\n```\n\nWe converted the media stream from `h264` to `h265`. As expected, the `h265` version of the media file is smaller than the 
`h264` however the [created program](\u002F3_transcoding.c) is capable of:\n\n```c\n\n  \u002F*\n   * H264 -> H265\n   * Audio -> remuxed (untouched)\n   * MP4 - MP4\n   *\u002F\n  StreamingParams sp = {0};\n  sp.copy_audio = 1;\n  sp.copy_video = 0;\n  sp.video_codec = \"libx265\";\n  sp.codec_priv_key = \"x265-params\";\n  sp.codec_priv_value = \"keyint=60:min-keyint=60:scenecut=0\";\n\n  \u002F*\n   * H264 -> H264 (fixed gop)\n   * Audio -> remuxed (untouched)\n   * MP4 - MP4\n   *\u002F\n  StreamingParams sp = {0};\n  sp.copy_audio = 1;\n  sp.copy_video = 0;\n  sp.video_codec = \"libx264\";\n  sp.codec_priv_key = \"x264-params\";\n  sp.codec_priv_value = \"keyint=60:min-keyint=60:scenecut=0:force-cfr=1\";\n\n  \u002F*\n   * H264 -> H264 (fixed gop)\n   * Audio -> remuxed (untouched)\n   * MP4 - fragmented MP4\n   *\u002F\n  StreamingParams sp = {0};\n  sp.copy_audio = 1;\n  sp.copy_video = 0;\n  sp.video_codec = \"libx264\";\n  sp.codec_priv_key = \"x264-params\";\n  sp.codec_priv_value = \"keyint=60:min-keyint=60:scenecut=0:force-cfr=1\";\n  sp.muxer_opt_key = \"movflags\";\n  sp.muxer_opt_value = \"frag_keyframe+empty_moov+delay_moov+default_base_moof\";\n\n  \u002F*\n   * H264 -> H264 (fixed gop)\n   * Audio -> AAC\n   * MP4 - MPEG-TS\n   *\u002F\n  StreamingParams sp = {0};\n  sp.copy_audio = 0;\n  sp.copy_video = 0;\n  sp.video_codec = \"libx264\";\n  sp.codec_priv_key = \"x264-params\";\n  sp.codec_priv_value = \"keyint=60:min-keyint=60:scenecut=0:force-cfr=1\";\n  sp.audio_codec = \"aac\";\n  sp.output_extension = \".ts\";\n\n  \u002F* WIP :P  -> it's not playing on VLC, the final bit rate is huge\n   * H264 -> VP9\n   * Audio -> Vorbis\n   * MP4 - WebM\n   *\u002F\n  \u002F\u002FStreamingParams sp = {0};\n  \u002F\u002Fsp.copy_audio = 0;\n  \u002F\u002Fsp.copy_video = 0;\n  \u002F\u002Fsp.video_codec = \"libvpx-vp9\";\n  \u002F\u002Fsp.audio_codec = \"libvorbis\";\n  \u002F\u002Fsp.output_extension = \".webm\";\n\n```\n\n> Now, to be honest, this was 
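[harder than I thought](https:\u002F\u002Fgithub.com\u002Fleandromoreira\u002Fffmpeg-libav-tutorial\u002Fpull\u002F54) it'd be. I had to dig into the [FFmpeg command line source code](https:\u002F\u002Fgithub.com\u002Fleandromoreira\u002Fffmpeg-libav-tutorial\u002Fpull\u002F54#issuecomment-570746749) and test it a lot, and I think I'm still missing something: I had to enforce `force-cfr` for the `h264` to work, and I'm still seeing some warning messages like `warning messages (forced frame type (5) at 80 was changed to frame type (3))`.\n","This project is an FFmpeg libav tutorial that aims to help developers learn media processing from the basics to advanced topics. Its core content covers audio and video encoding, remuxing, transcoding and related operations, and it explains how to use the FFmpeg libraries in detail through C example code. Although the examples are written mainly in C, the project emphasizes that this knowledge can easily be applied in other languages that have FFmpeg bindings. It is well suited as a practical reference for software engineers and enthusiasts who want a deeper understanding of how multimedia file processing works.","2026-06-11 03:06:05","top_language"]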