[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-4863":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":25,"hasPages":23,"topics":26,"createdAt":10,"pushedAt":10,"updatedAt":34,"readmeContent":35,"aiSummary":36,"trendingCount":16,"starSnapshotCount":16,"syncStatus":37,"lastSyncTime":38,"discoverSource":39},4863,"katana","projectdiscovery\u002Fkatana","projectdiscovery","A next-generation crawling and spidering framework.","",null,"Go",16996,1140,97,3,0,4,45,320,22,107.17,"MIT License",false,"dev",true,[27,28,29,30,31,32,33],"cli","crawler","gocrawler","hacktoberfest","headless","spider-framework","web-spider","2026-06-12 04:00:23","\u003Ch1 align=\"center\">\n  \u003Cimg src=\"https:\u002F\u002Fuser-images.githubusercontent.com\u002F8293321\u002F196779266-421c79d4-643a-4f73-9b54-3da379bbac09.png\" alt=\"katana\" width=\"200px\">\n  \u003Cbr>\n\u003C\u002Fh1>\n\n\u003Ch4 align=\"center\">A next-generation crawling and spidering framework\u003C\u002Fh4>\n\n\u003Cp align=\"center\">\n\u003Ca href=\"https:\u002F\u002Fgoreportcard.com\u002Freport\u002Fgithub.com\u002Fprojectdiscovery\u002Fkatana\">\u003Cimg src=\"https:\u002F\u002Fgoreportcard.com\u002Fbadge\u002Fgithub.com\u002Fprojectdiscovery\u002Fkatana\">\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fprojectdiscovery\u002Fkatana\u002Fissues\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fbadge\u002Fcontributions-welcome-brightgreen.svg?style=flat\">\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fprojectdiscovery\u002Fkatana\u002Freleases\">\u003Cimg 
src=\"https:\u002F\u002Fimg.shields.io\u002Fgithub\u002Frelease\u002Fprojectdiscovery\u002Fkatana\">\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Ftwitter.com\u002Fpdiscoveryio\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Ftwitter\u002Ffollow\u002Fpdiscoveryio.svg?logo=twitter\">\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fdiscord.gg\u002Fprojectdiscovery\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fdiscord\u002F695645237418131507.svg?logo=discord\">\u003C\u002Fa>\n\u003C\u002Fp>\n\n\u003Cp align=\"center\">\n  \u003Ca href=\"#features\">Features\u003C\u002Fa> •\n  \u003Ca href=\"#installation\">Installation\u003C\u002Fa> •\n  \u003Ca href=\"#usage\">Usage\u003C\u002Fa> •\n  \u003Ca href=\"#scope-control\">Scope\u003C\u002Fa> •\n  \u003Ca href=\"#crawler-configuration\">Config\u003C\u002Fa> •\n  \u003Ca href=\"#filters\">Filters\u003C\u002Fa> •\n  \u003Ca href=\"https:\u002F\u002Fdiscord.gg\u002Fprojectdiscovery\">Join Discord\u003C\u002Fa>\n\u003C\u002Fp>\n\n\n# Features\n\n![image](https:\u002F\u002Fuser-images.githubusercontent.com\u002F8293321\u002F199371558-daba03b6-bf9c-4883-8506-76497c6c3a44.png)\n\n - Fast And fully configurable web crawling\n - **Standard** and **Headless** mode\n - **JavaScript** parsing \u002F crawling\n - Customizable **automatic form filling**\n - **Scope control** - Preconfigured field \u002F Regex \n - **Customizable output** - Preconfigured fields\n - INPUT - **STDIN**, **URL** and **LIST**\n - OUTPUT - **STDOUT**, **FILE** and **JSON**\n\n\n## Installation\n\nkatana requires Go 1.25+ to install successfully. If you encounter any installation issues, we recommend trying with the latest available version of Go, as the minimum required version may have changed. 
Run the command below or download a pre-compiled binary from the [release page](https:\u002F\u002Fgithub.com\u002Fprojectdiscovery\u002Fkatana\u002Freleases).\n\n```console\nCGO_ENABLED=1 go install github.com\u002Fprojectdiscovery\u002Fkatana\u002Fcmd\u002Fkatana@latest\n```\n\n**More options to install \u002F run katana-**\n\n\u003Cdetails>\n  \u003Csummary>Docker\u003C\u002Fsummary>\n\n> To install \u002F update docker to latest tag -\n\n```sh\ndocker pull projectdiscovery\u002Fkatana:latest\n```\n\n> To run katana in standard mode using docker -\n\n\n```sh\ndocker run projectdiscovery\u002Fkatana:latest -u https:\u002F\u002Ftesla.com\n```\n\n> To run katana in headless mode using docker -\n\n```sh\ndocker run projectdiscovery\u002Fkatana:latest -u https:\u002F\u002Ftesla.com -system-chrome -headless\n```\n\n\u003C\u002Fdetails>\n\n\u003Cdetails>\n  \u003Csummary>Ubuntu\u003C\u002Fsummary>\n\n> It's recommended to install the following prerequisites -\n\n```sh\nsudo apt update\nsudo apt install zip curl wget git snapd\nsudo snap refresh\nsudo snap install golang --classic\n\nsudo install -d -m 0755 \u002Fetc\u002Fapt\u002Fkeyrings\ncurl -fsSL https:\u002F\u002Fdl.google.com\u002Flinux\u002Flinux_signing_key.pub \\\n  | sudo gpg --dearmor -o \u002Fetc\u002Fapt\u002Fkeyrings\u002Fgoogle-chrome.gpg\n\necho \"deb [arch=amd64 signed-by=\u002Fetc\u002Fapt\u002Fkeyrings\u002Fgoogle-chrome.gpg] \\\n  http:\u002F\u002Fdl.google.com\u002Flinux\u002Fchrome\u002Fdeb\u002F stable main\" \\\n  | sudo tee \u002Fetc\u002Fapt\u002Fsources.list.d\u002Fgoogle-chrome.list > \u002Fdev\u002Fnull\n\nsudo apt update\nsudo apt install google-chrome-stable\n```\n\n> install katana -\n\n\n```sh\ngo install github.com\u002Fprojectdiscovery\u002Fkatana\u002Fcmd\u002Fkatana@latest\n```\n\n\u003C\u002Fdetails>\n\n## Usage\n\n```console\nkatana -h\n```\n\nThis will display help for the tool. 
Here are all the switches it supports.\n\n```console\nKatana is a fast crawler focused on execution in automation\npipelines offering both headless and non-headless crawling.\n\nUsage:\n  .\u002Fkatana [flags]\n\nFlags:\nINPUT:\n   -u, -list string[]     target url \u002F list to crawl\n   -resume string         resume scan using resume.cfg\n   -e, -exclude string[]  exclude host matching specified filter ('cdn', 'private-ips', cidr, ip, regex)\n\nCONFIGURATION:\n   -r, -resolvers string[]       list of custom resolver (file or comma separated)\n   -d, -depth int                maximum depth to crawl (default 3)\n   -jc, -js-crawl                enable endpoint parsing \u002F crawling in javascript file\n   -jsl, -jsluice                enable jsluice parsing in javascript file (memory intensive)\n   -ct, -crawl-duration value    maximum duration to crawl the target for (s, m, h, d) (default s)\n   -kf, -known-files string      enable crawling of known files (all,robotstxt,sitemapxml), a minimum depth of 3 is required to ensure all known files are properly crawled.\n   -mrs, -max-response-size int  maximum response size to read (default 4194304)\n   -timeout int                  time to wait for request in seconds (default 10)\n   -aff, -automatic-form-fill    enable automatic form filling (experimental)\n   -fx, -form-extraction         extract form, input, textarea & select elements in jsonl output\n   -retry int                    number of times to retry the request (default 1)\n   -proxy string                 http\u002Fsocks5 proxy to use\n   -td, -tech-detect             enable technology detection\n   -H, -headers string[]         custom header\u002Fcookie to include in all http request in header:value format (file)\n   -config string                path to the katana configuration file\n   -fc, -form-config string      path to custom form configuration file\n   -flc, -field-config string    path to custom field configuration file\n   -s, -strategy string   
       Visit strategy (depth-first, breadth-first) (default \"depth-first\")\n   -iqp, -ignore-query-params    Ignore crawling same path with different query-param values\n   -fsu, -filter-similar         filter crawling of similar looking URLs (e.g., \u002Fusers\u002F123 and \u002Fusers\u002F456)\n   -fst, -filter-similar-threshold int  number of distinct values before a path position is treated as parameter (default 10)\n   -tlsi, -tls-impersonate       enable experimental client hello (ja3) tls randomization\n   -dr, -disable-redirects       disable following redirects (default false)\n   -kb, -knowledge-base          enable knowledge base classification\n   -mdp, -max-domain-pages int   maximum number of pages to crawl per domain (default unlimited)\n\nDEBUG:\n   -health-check, -hc        run diagnostic check up\n   -elog, -error-log string  file to write sent requests error log\n   -pprof-server             enable pprof server\n\nHEADLESS:\n   -hl, -headless                    enable headless hybrid crawling (experimental)\n   -sc, -system-chrome               use local installed chrome browser instead of katana installed\n   -sb, -show-browser                show the browser on the screen with headless mode\n   -ho, -headless-options string[]   start headless chrome with additional options\n   -nos, -no-sandbox                 start headless chrome in --no-sandbox mode\n   -cdd, -chrome-data-dir string     path to store chrome browser data\n   -scp, -system-chrome-path string  use specified chrome browser for headless crawling\n   -noi, -no-incognito               start headless chrome without incognito mode\n   -cwu, -chrome-ws-url string       use chrome browser instance launched elsewhere with the debugger listening at this URL\n   -xhr, -xhr-extraction             extract xhr request url,method in jsonl output\n   -pls, -page-load-strategy string  page load strategy (heuristic, load, domcontentloaded, networkidle, none) (default \"heuristic\")\n   -dwt, 
-dom-wait-time int          time in seconds to wait after page load when using domcontentloaded strategy (default 5)\n   -csp, -captcha-solver-provider string  captcha solver provider (e.g. capsolver)\n   -csk, -captcha-solver-key string       captcha solver provider api key\n\nSCOPE:\n   -cs, -crawl-scope string[]       in scope url regex to be followed by crawler\n   -cos, -crawl-out-scope string[]  out of scope url regex to be excluded by crawler\n   -fs, -field-scope string         pre-defined scope field (dn,rdn,fqdn) or custom regex (e.g., '(company-staging.io|company.com)') (default \"rdn\")\n   -ns, -no-scope                   disables host based default scope\n   -do, -display-out-scope          display external endpoint from scoped crawling\n\nFILTER:\n   -mr, -match-regex string[]             regex or list of regex to match on output url (cli, file)\n   -fr, -filter-regex string[]            regex or list of regex to filter on output url (cli, file)\n   -f, -field string                      field to display in output (url,path,fqdn,rdn,rurl,qurl,qpath,file,ufile,key,value,kv,dir,udir) (Deprecated: use -output-template instead)\n   -sf, -store-field string               field to store in per-host output (url,path,fqdn,rdn,rurl,qurl,qpath,file,ufile,key,value,kv,dir,udir)\n   -em, -extension-match string[]         match output for given extension (eg, -em php,html,js,none)\n   -ef, -extension-filter string[]        filter output for given extension (eg, -ef png,css)\n   -ndef, -no-default-ext-filter bool     remove default extensions from the filter list\n   -mdc, -match-condition string          match response with dsl based condition\n   -fdc, -filter-condition string         filter response with dsl based condition\n   -duf, -disable-unique-filter           disable duplicate content filtering\n   -filter-page-type string[]      filter response with page type (e.g. 
error,captcha,parked)\n\nRATE-LIMIT:\n   -c, -concurrency int          number of concurrent fetchers to use (default 10)\n   -p, -parallelism int          number of concurrent inputs to process (default 10)\n   -rd, -delay int               request delay between each request in seconds\n   -rl, -rate-limit int          maximum requests to send per second (default 150)\n   -rlm, -rate-limit-minute int  maximum number of requests to send per minute\n   -hrl, -host-rate-limit int    maximum requests to send per second per host\n   -hrlm, -host-rate-limit-minute int  maximum number of requests to send per minute per host\n\nUPDATE:\n   -up, -update                 update katana to latest version\n   -duc, -disable-update-check  disable automatic katana update check\n\nOUTPUT:\n   -o, -output string                file to write output to\n   -output-template string      custom output template\n   -sr, -store-response              store http requests\u002Fresponses\n   -srd, -store-response-dir string  store http requests\u002Fresponses to custom directory\n   -ncb, -no-clobber                 do not overwrite output file\n   -sfd, -store-field-dir string     store per-host field to custom directory\n   -or, -omit-raw                    omit raw requests\u002Fresponses from jsonl output\n   -ob, -omit-body                   omit response body from jsonl output\n   -lof, -list-output-fields         list available fields for jsonl output format\n   -eof, -exclude-output-fields      exclude fields from jsonl output\n   -j, -jsonl                        write output in jsonl format\n   -nc, -no-color                    disable output content coloring (ANSI escape codes)\n   -silent                           display output only\n   -v, -verbose                      display verbose output\n   -debug                            display debug output\n   -version                          display project version\n```\n\n## Running Katana\n\n### Input for katana\n\n**katana** 
requires **url** or **endpoint** to crawl and accepts single or multiple inputs.\n\nInput URL can be provided using `-u` option, and multiple values can be provided using comma-separated input, similarly **file** input is supported using `-list` option and additionally piped input (stdin) is also supported.\n\n#### URL Input\n\n```sh\nkatana -u https:\u002F\u002Ftesla.com\n```\n\n#### Multiple URL Input (comma-separated)\n\n```sh\nkatana -u https:\u002F\u002Ftesla.com,https:\u002F\u002Fgoogle.com\n```\n\n#### List Input\n```bash\n$ cat url_list.txt\n\nhttps:\u002F\u002Ftesla.com\nhttps:\u002F\u002Fgoogle.com\n```\n\n```\nkatana -list url_list.txt\n```\n\n#### STDIN (piped) Input\n\n```sh\necho https:\u002F\u002Ftesla.com | katana\n```\n\n```sh\ncat domains | httpx | katana\n```\n\nExample running katana -\n\n```console\nkatana -u https:\u002F\u002Fyoutube.com\n\n   __        __                \n  \u002F \u002F_____ _\u002F \u002F____ ____  ___ _\n \u002F  '_\u002F _  \u002F __\u002F _  \u002F _ \\\u002F _  \u002F\n\u002F_\u002F\\_\\\\_,_\u002F\\__\u002F\\_,_\u002F_\u002F\u002F_\u002F\\_,_\u002F v0.0.1                     \n\n      projectdiscovery.io\n\n[WRN] Use with caution. 
You are responsible for your actions.\n[WRN] Developers assume no liability and are not responsible for any misuse or damage.\nhttps:\u002F\u002Fwww.youtube.com\u002F\nhttps:\u002F\u002Fwww.youtube.com\u002Fabout\u002F\nhttps:\u002F\u002Fwww.youtube.com\u002Fabout\u002Fpress\u002F\nhttps:\u002F\u002Fwww.youtube.com\u002Fabout\u002Fcopyright\u002F\nhttps:\u002F\u002Fwww.youtube.com\u002Ft\u002Fcontact_us\u002F\nhttps:\u002F\u002Fwww.youtube.com\u002Fcreators\u002F\nhttps:\u002F\u002Fwww.youtube.com\u002Fads\u002F\nhttps:\u002F\u002Fwww.youtube.com\u002Ft\u002Fterms\nhttps:\u002F\u002Fwww.youtube.com\u002Ft\u002Fprivacy\nhttps:\u002F\u002Fwww.youtube.com\u002Fabout\u002Fpolicies\u002F\nhttps:\u002F\u002Fwww.youtube.com\u002Fhowyoutubeworks?utm_campaign=ytgen&utm_source=ythp&utm_medium=LeftNav&utm_content=txt&u=https%3A%2F%2Fwww.youtube.com%2Fhowyoutubeworks%3Futm_source%3Dythp%26utm_medium%3DLeftNav%26utm_campaign%3Dytgen\nhttps:\u002F\u002Fwww.youtube.com\u002Fnew\nhttps:\u002F\u002Fm.youtube.com\u002F\nhttps:\u002F\u002Fwww.youtube.com\u002Fs\u002Fdesktop\u002F4965577f\u002Fjsbin\u002Fdesktop_polymer.vflset\u002Fdesktop_polymer.js\nhttps:\u002F\u002Fwww.youtube.com\u002Fs\u002Fdesktop\u002F4965577f\u002Fcssbin\u002Fwww-main-desktop-home-page-skeleton.css\nhttps:\u002F\u002Fwww.youtube.com\u002Fs\u002Fdesktop\u002F4965577f\u002Fcssbin\u002Fwww-onepick.css\nhttps:\u002F\u002Fwww.youtube.com\u002Fs\u002F_\u002Fytmainappweb\u002F_\u002Fss\u002Fk=ytmainappweb.kevlar_base.0Zo5FUcPkCg.L.B1.O\u002Fam=gAE\u002Fd=0\u002Frs=AGKMywG5nh5Qp-BGPbOaI1evhF5BVGRZGA\nhttps:\u002F\u002Fwww.youtube.com\u002Fopensearch?locale=en_GB\nhttps:\u002F\u002Fwww.youtube.com\u002Fmanifest.webmanifest\nhttps:\u002F\u002Fwww.youtube.com\u002Fs\u002Fdesktop\u002F4965577f\u002Fcssbin\u002Fwww-main-desktop-watch-page-skeleton.css\nhttps:\u002F\u002Fwww.youtube.com\u002Fs\u002Fdesktop\u002F4965577f\u002Fjsbin\u002Fweb-animations-next-lite.min.vflset\u002Fweb-animations-next-lite.min.js\nhttps:\u002F\u0
02Fwww.youtube.com\u002Fs\u002Fdesktop\u002F4965577f\u002Fjsbin\u002Fcustom-elements-es5-adapter.vflset\u002Fcustom-elements-es5-adapter.js\nhttps:\u002F\u002Fwww.youtube.com\u002Fs\u002Fdesktop\u002F4965577f\u002Fjsbin\u002Fwebcomponents-sd.vflset\u002Fwebcomponents-sd.js\nhttps:\u002F\u002Fwww.youtube.com\u002Fs\u002Fdesktop\u002F4965577f\u002Fjsbin\u002Fintersection-observer.min.vflset\u002Fintersection-observer.min.js\nhttps:\u002F\u002Fwww.youtube.com\u002Fs\u002Fdesktop\u002F4965577f\u002Fjsbin\u002Fscheduler.vflset\u002Fscheduler.js\nhttps:\u002F\u002Fwww.youtube.com\u002Fs\u002Fdesktop\u002F4965577f\u002Fjsbin\u002Fwww-i18n-constants-en_GB.vflset\u002Fwww-i18n-constants.js\nhttps:\u002F\u002Fwww.youtube.com\u002Fs\u002Fdesktop\u002F4965577f\u002Fjsbin\u002Fwww-tampering.vflset\u002Fwww-tampering.js\nhttps:\u002F\u002Fwww.youtube.com\u002Fs\u002Fdesktop\u002F4965577f\u002Fjsbin\u002Fspf.vflset\u002Fspf.js\nhttps:\u002F\u002Fwww.youtube.com\u002Fs\u002Fdesktop\u002F4965577f\u002Fjsbin\u002Fnetwork.vflset\u002Fnetwork.js\nhttps:\u002F\u002Fwww.youtube.com\u002Fhowyoutubeworks\u002F\nhttps:\u002F\u002Fwww.youtube.com\u002Ftrends\u002F\nhttps:\u002F\u002Fwww.youtube.com\u002Fjobs\u002F\nhttps:\u002F\u002Fwww.youtube.com\u002Fkids\u002F\n```\n\n\n## Crawling Mode\n\n### Standard Mode\n\nStandard crawling modality uses the standard go http library under the hood to handle HTTP requests\u002Fresponses. This modality is much faster as it doesn't have the browser overhead. Still, it analyzes HTTP responses body as is, without any javascript or DOM rendering, potentially missing post-dom-rendered endpoints or asynchronous endpoint calls that might happen in complex web applications depending, for example, on browser-specific events.\n\n### Headless Mode\n\nHeadless mode hooks internal headless calls to handle HTTP requests\u002Fresponses directly within the browser context. 
This offers two advantages:\n- The HTTP fingerprint (TLS and user agent) fully identify the client as a legitimate browser\n- Better coverage since the endpoints are discovered analyzing the standard raw response, as in the previous modality, and also the browser-rendered one with javascript enabled.\n\nHeadless crawling is optional and can be enabled using `-headless` option.\n\nHere are other headless CLI options -\n\n```console\nkatana -h headless\n\nFlags:\nHEADLESS:\n   -hl, -headless                    enable headless hybrid crawling (experimental)\n   -sc, -system-chrome               use local installed chrome browser instead of katana installed\n   -sb, -show-browser                show the browser on the screen with headless mode\n   -ho, -headless-options string[]   start headless chrome with additional options\n   -nos, -no-sandbox                 start headless chrome in --no-sandbox mode\n   -cdd, -chrome-data-dir string     path to store chrome browser data\n   -scp, -system-chrome-path string  use specified chrome browser for headless crawling\n   -noi, -no-incognito               start headless chrome without incognito mode\n   -cwu, -chrome-ws-url string       use chrome browser instance launched elsewhere with the debugger listening at this URL\n   -xhr, -xhr-extraction             extract xhr requests\n   -pls, -page-load-strategy string  page load strategy (heuristic, load, domcontentloaded, networkidle, none) (default \"heuristic\")\n   -dwt, -dom-wait-time int          time in seconds to wait after page load when using domcontentloaded strategy (default 5)\n   -csp, -captcha-solver-provider string  captcha solver provider (e.g. 
capsolver)\n   -csk, -captcha-solver-key string       captcha solver provider api key\n```\n\n*`-no-sandbox`*\n----\n\nRuns headless chrome browser with **no-sandbox** option, useful when running as root user.\n\n```console\nkatana -u https:\u002F\u002Ftesla.com -headless -no-sandbox\n```\n\n*`-no-incognito`*\n----\n\nRuns headless chrome browser without incognito mode, useful when using the local browser.\n\n```console\nkatana -u https:\u002F\u002Ftesla.com -headless -no-incognito\n```\n\nTo preserve cookies and other browser session data across runs, combine `-no-incognito` with `-chrome-data-dir` so Katana reuses your chosen Chrome profile directory instead of an isolated temporary one.\n\n```console\nkatana -u https:\u002F\u002Ftesla.com -headless -no-incognito -chrome-data-dir \u002Ftmp\u002Fkatana-profile\n```\n\n*`-headless-options`*\n----\n\nWhen crawling in headless mode, additional chrome options can be specified using `-headless-options`, for example -\n\n\n```console\nkatana -u https:\u002F\u002Ftesla.com -headless -system-chrome -headless-options --disable-gpu,proxy-server=http:\u002F\u002F127.0.0.1:8080\n```\n\n*`-page-load-strategy`*\n----\n\nControls how katana waits for pages to load in headless mode. 
Different strategies are useful for different types of web applications:\n\n| Strategy | Description |\n|----------|-------------|\n| `heuristic` | (default) Smart waiting that adapts to page behavior - waits for load event, network idle, and DOM stability |\n| `load` | Waits only for the browser's load event |\n| `domcontentloaded` | Waits for DOMContentLoaded event plus additional time (configurable via `-dwt`) for JavaScript rendering |\n| `networkidle` | Waits for network activity to stop |\n| `none` | No waiting - returns immediately after navigation starts |\n\n```console\nkatana -u https:\u002F\u002Ftesla.com -headless -pls domcontentloaded\n```\n\nThe `domcontentloaded` strategy is particularly useful for Single Page Applications (SPAs) that never fully complete loading due to continuous background requests (websockets, polling, etc.).\n\n*`-dom-wait-time`*\n----\n\nWhen using the `domcontentloaded` page load strategy, this option specifies how many seconds to wait after the DOMContentLoaded event fires. This allows time for JavaScript to render interactive elements. Default is 5 seconds.\n\n```console\nkatana -u https:\u002F\u002Ftesla.com -headless -pls domcontentloaded -dwt 10\n```\n\n\n### Captcha Solving\n\nKatana supports automatic captcha detection and solving during headless crawling. When a captcha page is encountered, katana identifies the captcha provider, solves it via an external service, and continues crawling.\n\nSupported captcha types: **reCAPTCHA v2**, **reCAPTCHA v3**, **reCAPTCHA Enterprise**, **Cloudflare Turnstile**, **hCaptcha**\n\n*`-captcha-solver-provider`*\n----\n\nOption to specify the captcha solver provider. 
Currently supported: `capsolver`.\n\n*`-captcha-solver-key`*\n----\n\nAPI key for the captcha solver provider.\n\n```console\nkatana -u https:\u002F\u002Fexample.com -headless -csp capsolver -csk YOUR_API_KEY\n```\n\nThe provider and key can also be set via environment variables:\n\n```console\nexport CAPTCHA_SOLVER_PROVIDER=capsolver\nexport CAPTCHA_SOLVER_KEY=YOUR_API_KEY\nkatana -u https:\u002F\u002Fexample.com -headless\n```\n\n## Scope Control\n\nCrawling can be endless if not scoped, as such katana comes with multiple support to define the crawl scope.\n\n*`-field-scope`*\n----\nMost handy option to define scope with predefined field name, `rdn` being default option for field scope.\n\n   - `rdn` - crawling scoped to root domain name and all subdomains (e.g. `*example.com`) (default)\n   - `fqdn` - crawling scoped to given sub(domain) (e.g. `www.example.com` or `api.example.com`)\n   - `dn` - crawling scoped to domain name keyword (e.g. `example`)\n\n```console\nkatana -u https:\u002F\u002Ftesla.com -fs dn\n```\n\n\n*`-crawl-scope`*\n------\n\nFor advanced scope control, `-cs` option can be used that comes with **regex** support.\n\n```console\nkatana -u https:\u002F\u002Ftesla.com -cs login\n```\n\nFor multiple in scope rules, file input with multiline string \u002F regex can be passed.\n\n```bash\n$ cat in_scope.txt\n\nlogin\u002F\nadmin\u002F\napp\u002F\nwordpress\u002F\n```\n\n```console\nkatana -u https:\u002F\u002Ftesla.com -cs in_scope.txt\n```\n\n\n*`-crawl-out-scope`*\n-----\n\nFor defining what not to crawl, `-cos` option can be used and also support **regex** input.\n\n```console\nkatana -u https:\u002F\u002Ftesla.com -cos logout\n```\n\nFor multiple out of scope rules, file input with multiline string \u002F regex can be passed.\n\n```bash\n$ cat out_of_scope.txt\n\n\u002Flogout\n\u002Flog_out\n```\n\n```console\nkatana -u https:\u002F\u002Ftesla.com -cos out_of_scope.txt\n```\n\n*`-no-scope`*\n----\n\nKatana is default to scope `*.domain`, to 
disable this, the `-ns` option can be used, which also allows crawling the entire internet.\n\n```console\nkatana -u https:\u002F\u002Ftesla.com -ns\n```\n\n*`-display-out-scope`*\n----\n\nBy default, when a scope option is used, it also applies to the links displayed as output, so **external URLs are excluded by default**. To override this behavior, the `-do` option can be used to display all the external URLs that exist in the target's scoped URL \u002F endpoint.\n\n```\nkatana -u https:\u002F\u002Ftesla.com -do\n```\n\nHere are all the CLI options for scope control -\n\n\n```console\nkatana -h scope\n\nFlags:\nSCOPE:\n   -cs, -crawl-scope string[]       in scope url regex to be followed by crawler\n   -cos, -crawl-out-scope string[]  out of scope url regex to be excluded by crawler\n   -fs, -field-scope string         pre-defined scope field (dn,rdn,fqdn) (default \"rdn\")\n   -ns, -no-scope                   disables host based default scope\n   -do, -display-out-scope          display external endpoint from scoped crawling\n```\n\n\n## Crawler Configuration\n\nKatana comes with multiple options to configure and control the crawl the way we want.\n\n*`-depth`*\n----\n\nOption to define the `depth` to follow URLs for crawling; the greater the depth, the more endpoints are crawled and the longer the crawl takes.\n\n```\nkatana -u https:\u002F\u002Ftesla.com -d 5\n```\n\n*`-js-crawl`*\n----\n\nOption to enable JavaScript file parsing and crawling of the endpoints discovered in JavaScript files, disabled by default.\n\n```\nkatana -u https:\u002F\u002Ftesla.com -jc\n```\n\n*`-crawl-duration`*\n----\n\nOption to predefine the crawl duration, disabled by default.\n\n```\nkatana -u https:\u002F\u002Ftesla.com -ct 2\n```\n\n*`-known-files`*\n----\nOption to enable crawling of `robots.txt` and `sitemap.xml` files, disabled by default.\n\n```\nkatana -u https:\u002F\u002Ftesla.com -kf robotstxt,sitemapxml\n```\n\n*`-automatic-form-fill`*\n----\n\nOption to enable automatic form filling for known \u002F 
unknown fields. Known field values can be customized as needed by updating the form config file at `$HOME\u002F.config\u002Fkatana\u002Fform-config.yaml`.\n\nAutomatic form filling is an experimental feature.\n\n```\nkatana -u https:\u002F\u002Ftesla.com -aff\n```\n\nForm config values support DSL helper functions for dynamic data generation. All `rand_*` functions from the [projectdiscovery\u002Fdsl](https:\u002F\u002Fgithub.com\u002Fprojectdiscovery\u002Fdsl) library are available:\n\n```yaml\n# $HOME\u002F.config\u002Fkatana\u002Fform-config.yaml\nemail: \"rand_email()\"\nphone: \"rand_phone()\"\nplaceholder: \"rand_first_name()\"\npassword: 'rand_base(16, \"\")'\ncolor: \"#e66465\"\n```\n\n*`-filter-similar`*\n----\n\nOption to filter crawling of similar-looking URLs by normalizing variable path segments. This detects IDs, UUIDs, hashes, dates, and other dynamic values, and also learns repeating patterns at runtime. For example, `\u002Fusers\u002F123` and `\u002Fusers\u002F456` are treated as the same endpoint.\n\n```\nkatana -u https:\u002F\u002Ftesla.com -fsu\n```\n\nThe promotion threshold (how many distinct values at a path position before it's treated as a parameter) can be tuned with `-fst`. Lower values are more aggressive (fewer URLs crawled); higher values are more permissive. Default is `10`.\n\n```\nkatana -u https:\u002F\u002Ftesla.com -fsu -fst 5\n```\n\n*`-max-domain-pages`*\n----\n\nOption to limit the number of pages crawled per domain. This prevents any single domain from consuming the entire crawl budget, which is useful for large sites or crawler trap protection.\n\n```\nkatana -u https:\u002F\u002Ftesla.com -mdp 100\n```\n\n## Authenticated Crawling\n\nAuthenticated crawling involves including custom headers or cookies in HTTP requests to access protected resources. These headers provide authentication or authorization information, allowing you to crawl authenticated content \u002F endpoints. 
You can specify headers directly in the command line or provide them as a file with katana to perform authenticated crawling.\n\n> **Note**: The user needs to manually perform the authentication and export the session cookie \u002F header to a file to use with katana.\n\n*`-headers`*\n----\n\nOption to add a custom header or cookie to the request. \n> Syntax of [headers](https:\u002F\u002Fdatatracker.ietf.org\u002Fdoc\u002Fhtml\u002Frfc7230#section-3.2) in the HTTP specification\n\nHere is an example of adding a cookie to the request:\n```\nkatana -u https:\u002F\u002Ftesla.com -H 'Cookie: usrsess=AmljNrESo'\n```\n\nIt is also possible to supply headers or cookies as a file. For example:\n\n```\n$ cat cookie.txt\n\nCookie: PHPSESSIONID=XXXXXXXXX\nX-API-KEY: XXXXX\nTOKEN=XX\n```\n\n```\nkatana -u https:\u002F\u002Ftesla.com -H cookie.txt\n```\n\n\nThere are more options to configure when needed; here are all the config-related CLI options - \n\n```console\nkatana -h config\n\nFlags:\nCONFIGURATION:\n   -r, -resolvers string[]       list of custom resolver (file or comma separated)\n   -d, -depth int                maximum depth to crawl (default 3)\n   -jc, -js-crawl                enable endpoint parsing \u002F crawling in javascript file\n   -ct, -crawl-duration int      maximum duration to crawl the target for\n   -kf, -known-files string      enable crawling of known files (all,robotstxt,sitemapxml)\n   -mrs, -max-response-size int  maximum response size to read (default 9223372036854775807)\n   -timeout int                  time to wait for request in seconds (default 10)\n   -aff, -automatic-form-fill    enable automatic form filling (experimental)\n   -fx, -form-extraction         enable extraction of form, input, textarea & select elements\n   -retry int                    number of times to retry the request (default 1)\n   -proxy string                 http\u002Fsocks5 proxy to use\n   -H, -headers string[]         custom header\u002Fcookie to include in 
request\n   -config string                path to the katana configuration file\n   -fc, -form-config string      path to custom form configuration file\n   -flc, -field-config string    path to custom field configuration file\n   -s, -strategy string          Visit strategy (depth-first, breadth-first) (default \"depth-first\")\n   -iqp, -ignore-query-params    Ignore crawling same path with different query-param values\n   -fsu, -filter-similar         filter crawling of similar looking URLs (e.g., \u002Fusers\u002F123 and \u002Fusers\u002F456)\n   -fst, -filter-similar-threshold int  number of distinct values before a path position is treated as parameter (default 10)\n   -mdp, -max-domain-pages int   maximum number of pages to crawl per domain (default unlimited)\n```\n\n### Connecting to Active Browser Session\n\nKatana can also connect to an active browser session where the user is already logged in and authenticated, and use it for crawling. The only requirement is to start the browser with remote debugging enabled.\n\nHere is an example of starting the chrome browser with remote debugging enabled and using it with katana -\n\n**step 1) First, locate the path of the chrome executable**\n\n| Operating System | Chromium Executable Location | Google Chrome Executable Location |\n|------------------|------------------------------|-----------------------------------|\n| Windows (64-bit) | `C:\\Program Files (x86)\\Google\\Chromium\\Application\\chrome.exe` | `C:\\Program Files (x86)\\Google\\Chrome\\Application\\chrome.exe` |\n| Windows (32-bit) | `C:\\Program Files\\Google\\Chromium\\Application\\chrome.exe` | `C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe` |\n| macOS | `\u002FApplications\u002FChromium.app\u002FContents\u002FMacOS\u002FChromium` | `\u002FApplications\u002FGoogle Chrome.app\u002FContents\u002FMacOS\u002FGoogle Chrome` |\n| Linux | `\u002Fusr\u002Fbin\u002Fchromium` | `\u002Fusr\u002Fbin\u002Fgoogle-chrome` |\n\n**step 2) Start chrome with remote 
debugging enabled and it will return the websocket url. For example, on macOS, you can start chrome with remote debugging enabled using the following command** -\n\n```console\n$ \u002FApplications\u002FGoogle\\ Chrome.app\u002FContents\u002FMacOS\u002FGoogle\\ Chrome --remote-debugging-port=9222\n\n\nDevTools listening on ws:\u002F\u002F127.0.0.1:9222\u002Fdevtools\u002Fbrowser\u002Fc5316c9c-19d6-42dc-847a-41d1aeebf7d6\n```\n\n> Now log in to the website you want to crawl and keep the browser open.\n\n**step 3) Now use the websocket url with katana to connect to the active browser session and crawl the website**\n\n```console\nkatana -headless -u https:\u002F\u002Ftesla.com -cwu ws:\u002F\u002F127.0.0.1:9222\u002Fdevtools\u002Fbrowser\u002Fc5316c9c-19d6-42dc-847a-41d1aeebf7d6 -no-incognito\n```\n\n> **Note**: you can use the `-cdd` option to specify a custom chrome data directory to store browser data and cookies, but this does not preserve session data if a cookie is set to `Session` only or expires after a certain time.\n\n\n## Filters\n\n*`-field`*\n----\n\n> [!WARNING]\n> Deprecated: use [**`-output-template`**](#-output-template) instead. 
The field flag is still supported for backward compatibility.\n\nKatana comes with built-in fields that can be used to filter the output for the desired information. The `-f` option can be used to specify any of the available fields.\n\n```\n   -f, -field string  field to display in output (url,path,fqdn,rdn,rurl,qurl,qpath,file,ufile,key,value,kv,dir,udir)\n```\n\nHere is a table with an example of each field and the expected output when used - \n\n\n| FIELD   | DESCRIPTION                 | EXAMPLE                                                      |\n| ------- | --------------------------- | ------------------------------------------------------------ |\n| `url`   | URL Endpoint                | `https:\u002F\u002Fadmin.projectdiscovery.io\u002Fadmin\u002Flogin?user=admin&password=admin` |\n| `qurl`  | URL including query param   | `https:\u002F\u002Fadmin.projectdiscovery.io\u002Fadmin\u002Flogin.php?user=admin&password=admin` |\n| `qpath` | Path including query param  | `\u002Flogin?user=admin&password=admin`                           |\n| `path`  | URL Path                    | `https:\u002F\u002Fadmin.projectdiscovery.io\u002Fadmin\u002Flogin`              |\n| `fqdn`  | Fully Qualified Domain name | `admin.projectdiscovery.io`                                  |\n| `rdn`   | Root Domain name            | `projectdiscovery.io`                                        |\n| `rurl`  | Root URL                    | `https:\u002F\u002Fadmin.projectdiscovery.io`                          |\n| `ufile` | URL with File               | `https:\u002F\u002Fadmin.projectdiscovery.io\u002Flogin.js`                 |\n| `file`  | Filename in URL             | `login.php`                                                  |\n| `key`   | Parameter keys in URL       | `user,password`                                              |\n| `value` | Parameter values in URL     | `admin,admin`                                                |\n| `kv`    | Keys=Values in URL          | 
`user=admin&password=admin`                                  |\n| `dir`   | URL Directory name          | `\u002Fadmin\u002F`                                                    |\n| `udir`  | URL with Directory          | `https:\u002F\u002Fadmin.projectdiscovery.io\u002Fadmin\u002F`                   |\n\nHere is an example of using the field option to display only the urls that contain query parameters -\n\n```\nkatana -u https:\u002F\u002Ftesla.com -f qurl -silent\n\nhttps:\u002F\u002Fshop.tesla.com\u002Fen_au?redirect=no\nhttps:\u002F\u002Fshop.tesla.com\u002Fen_nz?redirect=no\nhttps:\u002F\u002Fshop.tesla.com\u002Fproduct\u002Fmen_s-raven-lightweight-zip-up-bomber-jacket?sku=1740250-00-A\nhttps:\u002F\u002Fshop.tesla.com\u002Fproduct\u002Ftesla-shop-gift-card?sku=1767247-00-A\nhttps:\u002F\u002Fshop.tesla.com\u002Fproduct\u002Fmen_s-chill-crew-neck-sweatshirt?sku=1740176-00-A\nhttps:\u002F\u002Fwww.tesla.com\u002Fabout?redirect=no\nhttps:\u002F\u002Fwww.tesla.com\u002Fabout\u002Flegal?redirect=no\nhttps:\u002F\u002Fwww.tesla.com\u002Ffindus\u002Flist?redirect=no\n```\n\n### Custom Fields\n\nYou can create custom fields to extract and store specific information from page responses using regex rules. These custom fields are defined in a YAML config file and are loaded from the default location at `$HOME\u002F.config\u002Fkatana\u002Ffield-config.yaml`. 
Alternatively, you can use the `-flc` option to load a custom field config file from a different location.\nHere is an example custom field.\n\n```yaml\n- name: email\n  type: regex\n  regex:\n  - '([a-zA-Z0-9._-]+@[a-zA-Z0-9._-]+\\.[a-zA-Z0-9_-]+)'\n  - '([a-zA-Z0-9+._-]+@[a-zA-Z0-9._-]+\\.[a-zA-Z0-9_-]+)'\n\n- name: phone\n  type: regex\n  regex:\n  - '\\d{3}-\\d{8}|\\d{4}-\\d{7}'\n```\n\nWhen defining custom fields, the following attributes are supported:\n\n- **name** (required)\n\n> The value of the **name** attribute is used as the `-field` cli option value.\n\n- **type** (required)\n\n> The type of custom attribute, currently supported option - `regex` \n\n- **part** (optional)\n\n> The part of the response to extract the information from. The default value is `response`, which includes both the header and body. Other possible values are `header` and `body`.\n\n- **group** (optional)\n\n> You can use this attribute to select a specific matched group in regex, for example: `group: 1`\n\n#### Running katana using custom field:\n\n```console\nkatana -u https:\u002F\u002Ftesla.com -f email,phone\n```\n\n*`-store-field`*\n---\n\nTo complement the `field` option, which filters output at run time, there is the `-sf, -store-field` option, which works exactly like the field option except that instead of filtering, it stores all the information on disk under the `katana_field` directory, sorted by target url. 
Use `-sfd` or `-store-field-dir` to store data in a different location.\n\n```\nkatana -u https:\u002F\u002Ftesla.com -sf key,fqdn,qurl -silent\n```\n\n```bash\n$ ls katana_field\u002F\n\nhttps_www.tesla.com_fqdn.txt\nhttps_www.tesla.com_key.txt\nhttps_www.tesla.com_qurl.txt\n```\n\nThe `-store-field` option can be useful for collecting information to build a targeted wordlist for various purposes, including but not limited to:\n\n- Identifying the most commonly used parameters\n- Discovering frequently used paths\n- Finding commonly used files\n- Identifying related or unknown subdomains\n\n### Katana Filters\n\n*`-extension-match`*\n---\n\nCrawl output can be matched against specific extensions using the `-em` option, so that only output containing the given extensions is displayed.\n\n```\nkatana -u https:\u002F\u002Ftesla.com -silent -em js,jsp,json\n```\n\nUse the special value `none` to also include URLs without a file extension in the output:\n\n```\nkatana -u https:\u002F\u002Ftesla.com -silent -em js,jsp,json,none\n```\n\n*`-extension-filter`*\n---\n\nCrawl output can be filtered by specific extensions using the `-ef` option, which removes all the urls containing the given extensions.\n\n```\nkatana -u https:\u002F\u002Ftesla.com -silent -ef css,txt,md\n```\n\n*`-no-default-ext-filter`*\n---\n\nKatana filters several extensions by default. This can be disabled with the `-ndef` option.\n\n```\nkatana -u https:\u002F\u002Ftesla.com -silent -ndef\n```\n\n*`-match-regex`*\n---\nThe `-match-regex` or `-mr` flag allows you to filter output URLs using regular expressions. When using this flag, only URLs that match the specified regular expression will be printed in the output.\n\n```\nkatana -u https:\u002F\u002Ftesla.com -mr 'https:\u002F\u002Fshop\\.tesla\\.com\u002F*' -silent\n```\n\n*`-filter-regex`*\n---\nThe `-filter-regex` or `-fr` flag allows you to filter output URLs using regular expressions. 
When this flag is used, URLs that match the specified regular expression are skipped.\n\n```\nkatana -u https:\u002F\u002Ftesla.com -fr 'https:\u002F\u002Fwww\\.tesla\\.com\u002F*' -silent\n```\n\n### Advanced Filtering\n\nKatana supports DSL-based expressions for advanced matching and filtering capabilities:\n\n- To match endpoints with a 200 status code:\n```shell\nkatana -u https:\u002F\u002Fwww.hackerone.com -mdc 'status_code == 200'\n```\n- To match endpoints that contain \"default\" and have a status code other than 403:\n```shell\nkatana -u https:\u002F\u002Fwww.hackerone.com -mdc 'contains(endpoint, \"default\") && status_code != 403'\n```\n- To match endpoints with PHP technologies:\n```shell\nkatana -u https:\u002F\u002Fwww.hackerone.com -mdc 'contains(to_lower(technologies), \"php\")'\n```\n- To filter out endpoints running on Cloudflare:\n```shell\nkatana -u https:\u002F\u002Fwww.hackerone.com -fdc 'contains(to_lower(technologies), \"cloudflare\")'\n```\nDSL functions can be applied to any key in the jsonl output. 
For more information on available DSL functions, please visit the [dsl project](https:\u002F\u002Fgithub.com\u002Fprojectdiscovery\u002Fdsl).\n\nHere are additional filter options -\n\n```console\nkatana -h filter\n\nFlags:\nFILTER:\n   -mr, -match-regex string[]             regex or list of regex to match on output url (cli, file)\n   -fr, -filter-regex string[]            regex or list of regex to filter on output url (cli, file)\n   -f, -field string                      field to display in output (url,path,fqdn,rdn,rurl,qurl,qpath,file,ufile,key,value,kv,dir,udir)\n   -sf, -store-field string               field to store in per-host output (url,path,fqdn,rdn,rurl,qurl,qpath,file,ufile,key,value,kv,dir,udir)\n   -em, -extension-match string[]         match output for given extension (eg, -em php,html,js,none)\n   -ef, -extension-filter string[]        filter output for given extension (eg, -ef png,css)\n   -ndef, -no-default-ext-filter bool     remove default extensions from the filter list\n   -mdc, -match-condition string          match response with dsl based condition\n   -fdc, -filter-condition string         filter response with dsl based condition\n   -duf, -disable-unique-filter           disable duplicate content filtering\n```\n\n\n## Rate Limit\n\nIt's easy to get blocked \u002F banned while crawling if you do not respect the target website's limits. Katana comes with multiple options to tune the crawl to go as fast \u002F slow as you want.\n\n*`-delay`*\n-----\n\nOption to introduce a delay in seconds between each new request katana makes while crawling. Disabled by default.\n\n```\nkatana -u https:\u002F\u002Ftesla.com -delay 20\n```\n\n*`-concurrency`*\n-----\nOption to control the number of urls per target to fetch at the same time.\n\n```\nkatana -u https:\u002F\u002Ftesla.com -c 20\n```\n\n\n*`-parallelism`*\n-----\nOption to define the number of targets to process at the same time from list input.\n\n```\nkatana -u https:\u002F\u002Ftesla.com -p 
20\n```\n\n*`-rate-limit`*\n-----\nMaximum requests per second, applied globally across all hosts.\n\n```\nkatana -u https:\u002F\u002Ftesla.com -rl 100\n```\n\n*`-rate-limit-minute`*\n-----\nMaximum requests per minute, applied globally across all hosts.\n\n```\nkatana -u https:\u002F\u002Ftesla.com -rlm 500\n```\n\n*`-host-rate-limit`*\n-----\nMaximum requests per second per host. Each host gets its own rate limit bucket, so a slow host won't throttle fast ones. Replaces the global rate limit when set. Katana also backs off automatically with exponential delay and jitter when a host returns 429 or 503.\n\n```console\nkatana -u https:\u002F\u002Ftesla.com -hrl 50\n```\n\n*`-host-rate-limit-minute`*\n-----\nMaximum requests per minute per host.\n\n```console\nkatana -u https:\u002F\u002Ftesla.com -hrlm 200\n```\n\nHere are all the long \u002F short CLI options for rate limit control -\n\n```console\nkatana -h rate-limit\n\nFlags:\nRATE-LIMIT:\n   -c, -concurrency int          number of concurrent fetchers to use (default 10)\n   -p, -parallelism int          number of concurrent inputs to process (default 10)\n   -rd, -delay int               request delay between each request in seconds\n   -rl, -rate-limit int          maximum requests to send per second (default 150)\n   -rlm, -rate-limit-minute int  maximum number of requests to send per minute\n   -hrl, -host-rate-limit int    maximum requests to send per second per host\n   -hrlm, -host-rate-limit-minute int  maximum number of requests to send per minute per host\n```\n\n## Output\n\nKatana supports file output in plain text format as well as JSON, which includes additional information such as the `source`, `tag`, and `attribute` name to correlate the discovered endpoint.\n\n*`-output`*\n---\n\nBy default, katana outputs the crawled endpoints in plain text format. 
The results can be written to a file by using the `-output` option.\n\n\n```console\nkatana -u https:\u002F\u002Fexample.com -no-scope -output example_endpoints.txt\n```\n\n*`-output-template`*\n---\n\nThe `-output-template` option allows you to customize the output format using a template, providing flexibility in defining the output structure. This option replaces the deprecated `-field` flag for filtering output. Instead of relying on predefined fields, you can specify a custom template directly on the command line to control how the extracted data is presented.\n\nExample of using the `-output-template` option:\n\n```sh\nkatana -u https:\u002F\u002Fexample.com -output-template '{{email}} - {{url}}'\n```\n\nIn this example, `email` represents a [custom field](#custom-fields) that extracts and displays email addresses found within the source `url`.\n\n> [!NOTE]\n> If a specified field does not exist or does not contain a value, it will simply be omitted from the output.\n\nThis option lets you structure the output in the way that best suits your use case, making data extraction more intuitive and customizable.\n\n*`-jsonl`*\n---\n\n```console\nkatana -u https:\u002F\u002Fexample.com -jsonl | jq .\n```\n\n```json\n{\n  \"timestamp\": \"2023-03-20T16:23:58.027559+05:30\",\n  \"request\": {\n    \"method\": \"GET\",\n    \"endpoint\": \"https:\u002F\u002Fexample.com\",\n    \"raw\": \"GET \u002F HTTP\u002F1.1\\r\\nHost: example.com\\r\\nUser-Agent: Mozilla\u002F5.0 (Macintosh; Intel Mac OS X 11_1) AppleWebKit\u002F537.36 (KHTML, like Gecko) Chrome\u002F87.0.4280.88 Safari\u002F537.36\\r\\nAccept-Encoding: gzip\\r\\n\\r\\n\"\n  },\n  \"response\": {\n    \"status_code\": 200,\n    \"headers\": {\n      \"accept_ranges\": \"bytes\",\n      \"expires\": \"Mon, 27 Mar 2023 10:53:58 GMT\",\n      \"last_modified\": \"Thu, 17 Oct 2019 07:18:26 GMT\",\n      \"content_type\": \"text\u002Fhtml; charset=UTF-8\",\n      \"server\": \"ECS (dcb\u002F7EA3)\",\n      \"vary\": 
\"Accept-Encoding\",\n      \"etag\": \"\\\"3147526947\\\"\",\n      \"cache_control\": \"max-age=604800\",\n      \"x_cache\": \"HIT\",\n      \"date\": \"Mon, 20 Mar 2023 10:53:58 GMT\",\n      \"age\": \"331239\"\n    },\n    \"body\": \"\u003C!doctype html>\\n\u003Chtml>\\n\u003Chead>\\n    \u003Ctitle>Example Domain\u003C\u002Ftitle>\\n\\n    \u003Cmeta charset=\\\"utf-8\\\" \u002F>\\n    \u003Cmeta http-equiv=\\\"Content-type\\\" content=\\\"text\u002Fhtml; charset=utf-8\\\" \u002F>\\n    \u003Cmeta name=\\\"viewport\\\" content=\\\"width=device-width, initial-scale=1\\\" \u002F>\\n    \u003Cstyle type=\\\"text\u002Fcss\\\">\\n    body {\\n        background-color: #f0f0f2;\\n        margin: 0;\\n        padding: 0;\\n        font-family: -apple-system, system-ui, BlinkMacSystemFont, \\\"Segoe UI\\\", \\\"Open Sans\\\", \\\"Helvetica Neue\\\", Helvetica, Arial, sans-serif;\\n        \\n    }\\n    div {\\n        width: 600px;\\n        margin: 5em auto;\\n        padding: 2em;\\n        background-color: #fdfdff;\\n        border-radius: 0.5em;\\n        box-shadow: 2px 3px 7px 2px rgba(0,0,0,0.02);\\n    }\\n    a:link, a:visited {\\n        color: #38488f;\\n        text-decoration: none;\\n    }\\n    @media (max-width: 700px) {\\n        div {\\n            margin: 0 auto;\\n            width: auto;\\n        }\\n    }\\n    \u003C\u002Fstyle>    \\n\u003C\u002Fhead>\\n\\n\u003Cbody>\\n\u003Cdiv>\\n    \u003Ch1>Example Domain\u003C\u002Fh1>\\n    \u003Cp>This domain is for use in illustrative examples in documents. 
You may use this\\n    domain in literature without prior coordination or asking for permission.\u003C\u002Fp>\\n    \u003Cp>\u003Ca href=\\\"https:\u002F\u002Fwww.iana.org\u002Fdomains\u002Fexample\\\">More information...\u003C\u002Fa>\u003C\u002Fp>\\n\u003C\u002Fdiv>\\n\u003C\u002Fbody>\\n\u003C\u002Fhtml>\\n\",\n    \"technologies\": [\n      \"Azure\",\n      \"Amazon ECS\",\n      \"Amazon Web Services\",\n      \"Docker\",\n      \"Azure CDN\"\n    ],\n    \"raw\": \"HTTP\u002F1.1 200 OK\\r\\nContent-Length: 1256\\r\\nAccept-Ranges: bytes\\r\\nAge: 331239\\r\\nCache-Control: max-age=604800\\r\\nContent-Type: text\u002Fhtml; charset=UTF-8\\r\\nDate: Mon, 20 Mar 2023 10:53:58 GMT\\r\\nEtag: \\\"3147526947\\\"\\r\\nExpires: Mon, 27 Mar 2023 10:53:58 GMT\\r\\nLast-Modified: Thu, 17 Oct 2019 07:18:26 GMT\\r\\nServer: ECS (dcb\u002F7EA3)\\r\\nVary: Accept-Encoding\\r\\nX-Cache: HIT\\r\\n\\r\\n\u003C!doctype html>\\n\u003Chtml>\\n\u003Chead>\\n    \u003Ctitle>Example Domain\u003C\u002Ftitle>\\n\\n    \u003Cmeta charset=\\\"utf-8\\\" \u002F>\\n    \u003Cmeta http-equiv=\\\"Content-type\\\" content=\\\"text\u002Fhtml; charset=utf-8\\\" \u002F>\\n    \u003Cmeta name=\\\"viewport\\\" content=\\\"width=device-width, initial-scale=1\\\" \u002F>\\n    \u003Cstyle type=\\\"text\u002Fcss\\\">\\n    body {\\n        background-color: #f0f0f2;\\n        margin: 0;\\n        padding: 0;\\n        font-family: -apple-system, system-ui, BlinkMacSystemFont, \\\"Segoe UI\\\", \\\"Open Sans\\\", \\\"Helvetica Neue\\\", Helvetica, Arial, sans-serif;\\n        \\n    }\\n    div {\\n        width: 600px;\\n        margin: 5em auto;\\n        padding: 2em;\\n        background-color: #fdfdff;\\n        border-radius: 0.5em;\\n        box-shadow: 2px 3px 7px 2px rgba(0,0,0,0.02);\\n    }\\n    a:link, a:visited {\\n        color: #38488f;\\n        text-decoration: none;\\n    }\\n    @media (max-width: 700px) {\\n        div {\\n            margin: 0 auto;\\n            width: auto;\\n 
       }\\n    }\\n    \u003C\u002Fstyle>    \\n\u003C\u002Fhead>\\n\\n\u003Cbody>\\n\u003Cdiv>\\n    \u003Ch1>Example Domain\u003C\u002Fh1>\\n    \u003Cp>This domain is for use in illustrative examples in documents. You may use this\\n    domain in literature without prior coordination or asking for permission.\u003C\u002Fp>\\n    \u003Cp>\u003Ca href=\\\"https:\u002F\u002Fwww.iana.org\u002Fdomains\u002Fexample\\\">More information...\u003C\u002Fa>\u003C\u002Fp>\\n\u003C\u002Fdiv>\\n\u003C\u002Fbody>\\n\u003C\u002Fhtml>\\n\"\n  }\n}\n```\n\n*`-store-response`*\n----\n\nThe `-store-response` option allows for writing all crawled endpoint requests and responses to a text file. When this option is used, text files including the request and response will be written to the **katana_response** directory. If you would like to specify a custom directory, you can use the `-store-response-dir` option.\n\n```console\nkatana -u https:\u002F\u002Fexample.com -no-scope -store-response\n```\n\n```bash\n$ cat katana_response\u002Findex.txt\n\nkatana_response\u002Fexample.com\u002F327c3fda87ce286848a574982ddd0b7c7487f816.txt https:\u002F\u002Fexample.com (200 OK)\nkatana_response\u002Fwww.iana.org\u002Fbfc096e6dd93b993ca8918bf4c08fdc707a70723.txt http:\u002F\u002Fwww.iana.org\u002Fdomains\u002Freserved (200 OK)\n```\n\n**Note:**\n\n*`-store-response` option is not supported in `-headless` mode.*\n\n*`-list-output-fields`*\n----\n\nThe `-list-output-fields` or `-lof` flag displays all available fields that can be used in JSONL output format. This is useful for understanding what data is available when using custom output templates or when excluding specific fields.\n\n```console\nkatana -lof\n```\n\n*`-exclude-output-fields`*\n----\n\nThe `-exclude-output-fields` or `-eof` flag allows you to exclude specific fields from the JSONL output. 
This is useful for reducing output size or focusing on specific data by removing unwanted fields.\n\n```console\nkatana -u https:\u002F\u002Fexample.com -jsonl -eof raw,body\n```\n\nHere are additional CLI options related to output -\n\n```console\nkatana -h output\n\nOUTPUT:\n   -o, -output string                file to write output to\n   -sr, -store-response              store http requests\u002Fresponses\n   -srd, -store-response-dir string  store http requests\u002Fresponses to custom directory\n   -lof, -list-output-fields         list available fields for jsonl output format\n   -eof, -exclude-output-fields      exclude fields from jsonl output\n   -j, -json                         write output in JSON Lines format\n   -nc, -no-color                    disable output content coloring (ANSI escape codes)\n   -silent                           display output only\n   -v, -verbose                      display verbose output\n   -version                          display project version\n```\n\n## Katana as a library\n`katana` can be used as a library by creating an instance of the `Options` struct and populating it with the same options that would be specified via the CLI. 
Using the options, you can create `crawlerOptions` and then a standard or hybrid `crawler`.\nCall the `crawler.Crawl` method to crawl the input.\n\n```go\npackage main\n\nimport (\n\t\"math\"\n\n\t\"github.com\u002Fprojectdiscovery\u002Fgologger\"\n\t\"github.com\u002Fprojectdiscovery\u002Fkatana\u002Fpkg\u002Fengine\u002Fstandard\"\n\t\"github.com\u002Fprojectdiscovery\u002Fkatana\u002Fpkg\u002Foutput\"\n\t\"github.com\u002Fprojectdiscovery\u002Fkatana\u002Fpkg\u002Ftypes\"\n)\n\nfunc main() {\n\toptions := &types.Options{\n\t\tMaxDepth:     3,             \u002F\u002F Maximum depth to crawl\n\t\tFieldScope:   \"rdn\",         \u002F\u002F Crawling Scope Field\n\t\tBodyReadSize: math.MaxInt,   \u002F\u002F Maximum response size to read\n\t\tTimeout:      10,            \u002F\u002F Timeout is the time to wait for request in seconds\n\t\tConcurrency:  10,            \u002F\u002F Concurrency is the number of concurrent crawling goroutines\n\t\tParallelism:  10,            \u002F\u002F Parallelism is the number of urls processing goroutines\n\t\tDelay:        0,             \u002F\u002F Delay is the delay between crawl requests in seconds\n\t\tRateLimit:    150,           \u002F\u002F Maximum requests to send per second\n\t\tStrategy:     \"depth-first\", \u002F\u002F Visit strategy (depth-first, breadth-first)\n\t\tOnResult: func(result output.Result) { \u002F\u002F Callback function to execute for result\n\t\t\tgologger.Info().Msg(result.Request.URL)\n\t\t},\n\t}\n\tcrawlerOptions, err := types.NewCrawlerOptions(options)\n\tif err != nil {\n\t\tgologger.Fatal().Msg(err.Error())\n\t}\n\tdefer crawlerOptions.Close()\n\tcrawler, err := standard.New(crawlerOptions)\n\tif err != nil {\n\t\tgologger.Fatal().Msg(err.Error())\n\t}\n\tdefer crawler.Close()\n\tvar input = \"https:\u002F\u002Fwww.hackerone.com\"\n\terr = crawler.Crawl(input)\n\tif err != nil {\n\t\tgologger.Warning().Msgf(\"Could not crawl %s: %s\", input, err.Error())\n\t}\n}\n```\n\n## Reporting 
Issues & Feature Requests\n\nTo maintain issue tracking and improve triage efficiency:\n\n**All reports start as [GitHub Discussions](https:\u002F\u002Fgithub.com\u002Fprojectdiscovery\u002Fkatana\u002Fdiscussions)**\n\n- **Bug Reports** → [Start a Q&A Discussion](https:\u002F\u002Fgithub.com\u002Fprojectdiscovery\u002Fkatana\u002Fdiscussions\u002Fnew?category=q-a)\n- **Feature Requests** → [Start an Ideas Discussion](https:\u002F\u002Fgithub.com\u002Fprojectdiscovery\u002Fkatana\u002Fdiscussions\u002Fnew?category=ideas)  \n- **Questions** → [Start a Q&A Discussion](https:\u002F\u002Fgithub.com\u002Fprojectdiscovery\u002Fkatana\u002Fdiscussions\u002Fnew?category=q-a)\n\n**Why Discussions First?**\n- **Community can help** with quick questions and troubleshooting\n- **Better triage** - confirmed bugs\u002Ffeatures become tracked issues  \n- **Cleaner issue tracker** - focus on actionable items only\n\nMaintainers will convert discussions to issues when appropriate after proper review.\n\n--------\n\n\u003Cdiv align=\"center\">\n\nkatana is made with ❤️ by the [projectdiscovery](https:\u002F\u002Fprojectdiscovery.io) team and distributed under [MIT License](LICENSE.md).\n\n\n\u003Ca href=\"https:\u002F\u002Fdiscord.gg\u002Fprojectdiscovery\">\u003Cimg src=\"https:\u002F\u002Fraw.githubusercontent.com\u002Fprojectdiscovery\u002Fnuclei-burp-plugin\u002Fmain\u002Fstatic\u002Fjoin-discord.png\" width=\"300\" alt=\"Join Discord\">\u003C\u002Fa>\n\n\u003C\u002Fdiv>\n","Katana is a next-generation web crawling and spidering framework. It supports fast, fully configurable web crawling, offers standard and headless modes, and can parse and crawl JavaScript content. Katana also provides customizable automatic form filling, scope control (preconfigured fields \u002F regex), and customizable output. Input can come from standard input, a URL, or a list, and output formats include standard output, file, and JSON. It suits scenarios that need efficient, flexible web data collection, such as site structure analysis and security testing. The project is written in Go, is released under the MIT License, and is aimed at technical users with demanding web crawling requirements.",2,"2026-06-11 03:01:10","top_language"]