[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-2340":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":18,"stars90d":16,"forks30d":16,"starsTrendScore":19,"compositeScore":20,"rankGlobal":10,"rankLanguage":10,"license":21,"archived":22,"fork":22,"defaultBranch":23,"hasWiki":24,"hasPages":24,"topics":25,"createdAt":10,"pushedAt":10,"updatedAt":32,"readmeContent":33,"aiSummary":34,"trendingCount":16,"starSnapshotCount":16,"syncStatus":17,"lastSyncTime":35,"discoverSource":36},2340,"NLP-progress","sebastianruder\u002FNLP-progress","sebastianruder","Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.","https:\u002F\u002Fnlpprogress.com\u002F",null,"Python",22959,3598,1250,36,0,2,5,6,77,"MIT License",false,"master",true,[26,27,28,29,30,31],"dialogue","machine-learning","machine-translation","named-entity-recognition","natural-language-processing","nlp-tasks","2026-06-12 04:00:14","# Tracking Progress in Natural Language Processing\n\n## Table of contents\n\n### English\n\n- [Automatic speech recognition](english\u002Fautomatic_speech_recognition.md)\n- [CCG](english\u002Fccg.md)\n- [Common sense](english\u002Fcommon_sense.md)\n- [Constituency parsing](english\u002Fconstituency_parsing.md)\n- [Coreference resolution](english\u002Fcoreference_resolution.md)\n- [Data-to-Text Generation](english\u002Fdata_to_text_generation.md)\n- [Dependency parsing](english\u002Fdependency_parsing.md)\n- [Dialogue](english\u002Fdialogue.md)\n- [Domain adaptation](english\u002Fdomain_adaptation.md)\n- [Entity linking](english\u002Fentity_linking.md)\n- [Grammatical error correction](english\u002Fgrammatical_error_correction.md)\n- 
[Information extraction](english\u002Finformation_extraction.md)\n- [Intent Detection and Slot Filling](english\u002Fintent_detection_slot_filling.md) \n- [Keyphrase Extraction and Generation](english\u002Fkeyphrase_extraction_generation.md)\n- [Language modeling](english\u002Flanguage_modeling.md)\n- [Lexical normalization](english\u002Flexical_normalization.md)\n- [Machine translation](english\u002Fmachine_translation.md)\n- [Missing elements](english\u002Fmissing_elements.md)\n- [Multi-task learning](english\u002Fmulti-task_learning.md)\n- [Multi-modal](english\u002Fmultimodal.md)\n- [Named entity recognition](english\u002Fnamed_entity_recognition.md)\n- [Natural language inference](english\u002Fnatural_language_inference.md)\n- [Part-of-speech tagging](english\u002Fpart-of-speech_tagging.md)\n- [Paraphrase Generation](english\u002Fparaphrase-generation.md)\n- [Question answering](english\u002Fquestion_answering.md)\n- [Relation prediction](english\u002Frelation_prediction.md)\n- [Relationship extraction](english\u002Frelationship_extraction.md)\n- [Semantic textual similarity](english\u002Fsemantic_textual_similarity.md)\n- [Semantic parsing](english\u002Fsemantic_parsing.md)\n- [Semantic role labeling](english\u002Fsemantic_role_labeling.md)\n- [Sentiment analysis](english\u002Fsentiment_analysis.md)\n- [Shallow syntax](english\u002Fshallow_syntax.md)\n- [Simplification](english\u002Fsimplification.md)\n- [Stance detection](english\u002Fstance_detection.md)\n- [Summarization](english\u002Fsummarization.md)\n- [Taxonomy learning](english\u002Ftaxonomy_learning.md)\n- [Temporal processing](english\u002Ftemporal_processing.md)\n- [Text classification](english\u002Ftext_classification.md)\n- [Word sense disambiguation](english\u002Fword_sense_disambiguation.md)\n\n### Vietnamese\n\n- [Dependency parsing](vietnamese\u002Fvietnamese.md#dependency-parsing)\n- [Intent detection and Slot filling](vietnamese\u002Fvietnamese.md#intent-detection-and-slot-filling)\n- 
[Machine translation](vietnamese\u002Fvietnamese.md#machine-translation)\n- [Named entity recognition](vietnamese\u002Fvietnamese.md#named-entity-recognition)\n- [Part-of-speech tagging](vietnamese\u002Fvietnamese.md#part-of-speech-tagging)\n- [Semantic parsing](vietnamese\u002Fvietnamese.md#semantic-parsing)\n- [Word segmentation](vietnamese\u002Fvietnamese.md#word-segmentation)\n\n### Hindi\n\n- [Chunking](hindi\u002Fhindi.md#chunking)\n- [Part-of-speech tagging](hindi\u002Fhindi.md#part-of-speech-tagging)\n- [Machine Translation](hindi\u002Fhindi.md#machine-translation)\n\n### Chinese\n\n- [Entity linking](chinese\u002Fchinese.md#entity-linking)\n- [Chinese word segmentation](chinese\u002Fchinese_word_segmentation.md)\n- [Question answering](chinese\u002Fquestion_answering.md)\n\nFor more tasks, datasets and results in Chinese, check out the [Chinese NLP](https:\u002F\u002Fchinesenlp.xyz\u002F#\u002F) website.\n\n### French\n\n- [Question answering](french\u002Fquestion_answering.md)\n- [Summarization](french\u002Fsummarization.md)\n\n### Russian\n\n- [Question answering](russian\u002Fquestion_answering.md)\n- [Sentiment Analysis](russian\u002Fsentiment-analysis.md)\n- [Summarization](russian\u002Fsummarization.md)\n\n### Spanish\n\n- [Named Entity Recognition](spanish\u002Fnamed_entity_recognition.md)\n- [Entity linking](spanish\u002Fentity_linking.md#entity-linking)\n- [Summarization](spanish\u002Fsummarization.md)\n\n### Portuguese\n\n- [Question Answering](portuguese\u002Fquestion_answering.md)\n\n### Korean\n\n- [Question Answering](korean\u002Fquestion_answering.md)\n\n### Nepali\n\n- [Machine Translation](nepali\u002Fnepali.md#machine-translation)\n\n### Bengali\n- [Part-of-speech Tagging](bengali\u002Fpart_of_speech_tagging.md)\n- [Emotion Detection](bengali\u002Femotion_detection.md)\n- [Sentiment Analysis](bengali\u002Fsentiment_analysis.md)\n\n### Persian\n- [Named entity recognition](persian\u002Fnamed_entity_recognition.md)\n- [Natural language 
inference](persian\u002Fnatural_language_inference.md)\n- [Summarization](persian\u002Fsummarization.md)\n\n### Turkish\n\n- [Summarization](turkish\u002Fsummarization.md)\n\n### German\n\n- [Question Answering](german\u002Fquestion_answering.md)\n- [Summarization](german\u002Fsummarization.md)\n\n### Arabic\n- [Language modeling](arabic\u002Flanguage_modeling.md)\n\n\nThis document aims to track the progress in Natural Language Processing (NLP) and give an overview\nof the state-of-the-art (SOTA) across the most common NLP tasks and their corresponding datasets.\n\nIt aims to cover both traditional and core NLP tasks such as dependency parsing and part-of-speech tagging\nas well as more recent ones such as reading comprehension and natural language inference. The main objective\nis to provide the reader with a quick overview of benchmark datasets and the state-of-the-art for their\ntask of interest, which serves as a stepping stone for further research. To this end, if there is a \nplace where results for a task are already published and regularly maintained, such as a public leaderboard,\nthe reader will be pointed there.\n\nIf you want to find this document again in the future, just go to [`nlpprogress.com`](https:\u002F\u002Fnlpprogress.com\u002F)\nor [`nlpsota.com`](http:\u002F\u002Fnlpsota.com\u002F) in your browser.\n\n### Contributing\n\n#### Guidelines\n\n**Results** &nbsp; Results reported in published papers are preferred; an exception may be made for influential preprints.\n\n**Datasets** &nbsp; Datasets should have been used for evaluation in at least one published paper besides \nthe one that introduced the dataset.\n\n**Code** &nbsp; We recommend to add a link to an implementation \nif available. 
You can add a `Code` column (see below) to the table if it does not exist.\nIn the `Code` column, indicate an official implementation with [Official](http:\u002F\u002Flink_to_implementation).\nIf an unofficial implementation is available, use [Link](http:\u002F\u002Flink_to_implementation) (see below).\nIf no implementation is available, you can leave the cell empty.\n\n#### Adding a new result\n\nIf you would like to add a new result, you can just click on the small edit button in the top-right\ncorner of the file for the respective task (see below).\n\n![Click on the edit button to add a file](img\u002Fedit_file.png)\n\nThis allows you to edit the file in Markdown. Simply add a row to the corresponding table in the\nsame format. Make sure that the table stays sorted (with the best result on top). \nAfter you've made your change, make sure that the table still looks ok by clicking on the\n\"Preview changes\" tab at the top of the page. If everything looks good, go to the bottom of the page,\nwhere you will see the form below. \n\n![Fill out the file change information](img\u002Fpropose_file_change.png)\n\nAdd a name for your proposed change and an optional description, indicate that you would like to\n\"Create a new branch for this commit and start a pull request\", and click on \"Propose file change\".\n\n#### Adding a new dataset or task\n\nTo add a new dataset or task, you can also follow the steps above. Alternatively, you can fork the repository.\nIn both cases, follow the steps below:\n\n1. If your task is completely new, create a new file and link to it in the table of contents above.\n2. If not, add your task or dataset to the respective section of the corresponding file (in alphabetical order).\n3. Briefly describe the dataset\u002Ftask and include relevant references. \n4. Describe the evaluation setting and evaluation metric.\n5. Show what an annotated example of the dataset\u002Ftask looks like.\n6. Add a download link if available.\n7. 
Copy the table below and fill in at least two results (including the state-of-the-art)\n  for your dataset\u002Ftask (change Score to the metric of your dataset). If your dataset\u002Ftask\n  has multiple metrics, add them to the right of `Score`.\n8. Submit your change as a pull request.\n  \n| Model           | Score  |  Paper \u002F Source | Code |\n| ------------- | :-----:| --- | --- |\n|  |  |  | |\n\n\n### Wish list\n\nThese are tasks and datasets that are still missing:\n\n- Bilingual dictionary induction\n- Discourse parsing\n- Keyphrase extraction\n- Knowledge base population (KBP)\n- More dialogue tasks\n- Semi-supervised learning\n- Frame-semantic parsing (FrameNet full-sentence analysis)\n\n### Exporting into a structured format\n\nYou can extract all the data into a structured, machine-readable JSON format with parsed tasks, descriptions and SOTA tables. \n\nThe instructions are in [structured\u002FREADME.md](structured\u002FREADME.md).\n\n### Instructions for building the site locally\n\nInstructions for building the website locally using Jekyll can be found [here](jekyll_instructions.md).\n\n\n","This project tracks progress in natural language processing (NLP), including the datasets and the current state of the art for common NLP tasks. It covers a wide range of tasks, from automatic speech recognition and machine translation to sentiment analysis, and records the latest results and the datasets used for each task. The project is written in Python and covers NLP tasks in multiple languages, such as English, Vietnamese, and Hindi. It is a useful reference for researchers, developers, and anyone interested in NLP who wants to follow current developments and the state of the art.","2026-06-11 02:49:32","top_language"]