[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-9854":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":25,"hasPages":23,"topics":26,"createdAt":10,"pushedAt":10,"updatedAt":30,"readmeContent":31,"aiSummary":32,"trendingCount":16,"starSnapshotCount":16,"syncStatus":33,"lastSyncTime":34,"discoverSource":35},9854,"reflexion","noahshinn\u002Freflexion","noahshinn","[NeurIPS 2023] Reflexion: Language Agents with Verbal Reinforcement Learning","",null,"Python",3178,307,31,20,0,1,9,37,6,29.47,"MIT License",false,"main",true,[27,28,29],"ai","artificial-intelligence","llm","2026-06-12 02:02:13","# [NeurIPS 2023] Reflexion: Language Agents with Verbal Reinforcement Learning\n\nThis repo holds the code, demos, and log files for [Reflexion: Language Agents with Verbal Reinforcement Learning](https:\u002F\u002Farxiv.org\u002Fabs\u002F2303.11366) by Noah Shinn, Federico Cassano, Edward Berman, Ashwin Gopinath, Karthik Narasimhan, Shunyu Yao. \n\n![Reflexion RL diagram](.\u002Ffigures\u002Freflexion_rl.png)\n\n![Reflexion tasks](.\u002Ffigures\u002Freflexion_tasks.png)\n\nWe have released the LeetcodeHardGym [here](https:\u002F\u002Fgithub.com\u002FGammaTauAI\u002Fleetcode-hard-gym)\n\n## To Run: reasoning (HotPotQA)\n\nWe have provided a set of notebooks to easily run, explore, and interact with the results of the reasoning experiments. Each experiment consists of a random sample of 100 questions from the HotPotQA distractor dataset. 
Each question in the sample is attempted by an agent with a specific type and reflexion strategy.\n\n### Setup\n\nTo get started:\n\n1. Clone this repo and move to the HotPotQA directory:\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fnoahshinn\u002Freflexion && cd .\u002Freflexion\u002Fhotpotqa_runs\n```\n\n2. Install the module dependencies into your environment:\n\n```bash\npip install -r requirements.txt\n```\n\n3. Set the `OPENAI_API_KEY` environment variable to your OpenAI API key:\n\n```bash\nexport OPENAI_API_KEY=\u003Cyour key>\n```\n\n#### Agent Types\n\nThe agent type is determined by the notebook you choose to run. The available agent types include:\n\n- `ReAct` - ReAct Agent\n\n- `CoT_context` - CoT Agent given supporting context about the question\n\n- `CoT_no_context` - CoT Agent given no supporting context about the question\n\nThe notebook for each agent type is located in the `.\u002Fhotpotqa_runs\u002Fnotebooks` directory.\n\n#### Reflexion Strategies\n\nEach notebook allows you to specify the reflexion strategy to be used by the agents. The available reflexion strategies, which are defined in an `Enum`, include:\n\n- `ReflexionStrategy.NONE` - The agent is not given any information about its last attempt.\n\n- `ReflexionStrategy.LAST_ATTEMPT` - The agent is given its reasoning trace from its last attempt on the question as context.\n\n- `ReflexionStrategy.REFLEXION` - The agent is given its self-reflection on the last attempt as context. 
\n\n- `ReflexionStrategy.LAST_ATTEMPT_AND_REFLEXION` - The agent is given both its reasoning trace and self-reflection on the last attempt as context.\n\n### To Run: decision-making (AlfWorld)\n\nClone this repo and move to the AlfWorld directory:\n\n```bash\ngit clone https:\u002F\u002Fgithub.com\u002Fnoahshinn\u002Freflexion && cd .\u002Freflexion\u002Falfworld_runs\n```\n\nSpecify the run parameters in `.\u002Frun_reflexion.sh`:\n\n- `num_trials`: number of iterative learning steps\n\n- `num_envs`: number of task-environment pairs per trial\n\n- `run_name`: the name for this run\n\n- `use_memory`: use persistent memory to store self-reflections (turn off for a baseline run)\n\n- `is_resume`: resume a previous run from its logging directory\n\n- `resume_dir`: the logging directory from which to resume the previous run\n\n- `start_trial_num`: when resuming a run, the trial number at which to start\n\nRun the trial:\n\n```bash\n.\u002Frun_reflexion.sh\n```\n\nThe logs will be sent to `.\u002Froot\u002F\u003Crun_name>`.\n\n### Another Note\n\nDue to the nature of these experiments, it may not be feasible for individual developers to reproduce the results, as GPT-4 access is limited and its API charges are significant. 
All runs from the paper and additional results are logged in `.\u002Falfworld_runs\u002Froot` for decision-making, `.\u002Fhotpotqa_runs\u002Froot` for reasoning, and `.\u002Fprogramming_runs\u002Froot` for programming.\n\n### Other Notes\n\nCheck out the original implementation [here](https:\u002F\u002Fgithub.com\u002Fnoahshinn\u002Freflexion-draft)\n\nRead one of the original blog posts [here](https:\u002F\u002Fnanothoughts.substack.com\u002Fp\u002Freflecting-on-reflexion)\n\nCheck out an [Appl](https:\u002F\u002Fgithub.com\u002Fappl-team\u002Fappl) implementation [here](https:\u002F\u002Fgithub.com\u002Fappl-team\u002Freppl\u002Ftree\u002Fmain\u002Freflexion).\n\nCheck out an interesting type-prediction implementation here: [OpenTau](https:\u002F\u002Fgithub.com\u002FGammaTauAI\u002Fopentau)\n\nFor all questions, contact [noahrshinn@gmail.com](mailto:noahrshinn@gmail.com)\n\n### Cite\n\n```bibtex\n@misc{shinn2023reflexion,\n      title={Reflexion: Language Agents with Verbal Reinforcement Learning}, \n      author={Noah Shinn and Federico Cassano and Edward Berman and Ashwin Gopinath and Karthik Narasimhan and Shunyu Yao},\n      year={2023},\n      eprint={2303.11366},\n      archivePrefix={arXiv},\n      primaryClass={cs.AI}\n}\n```\n","Reflexion is a verbal reinforcement learning project built on language agents that aims to improve AI performance on complex tasks through natural language processing techniques. Its core feature is optimizing the agent's learning process through different reflexion strategies (such as providing only information from the last attempt, or providing a self-reflection). It is written mainly in Python and relies on the OpenAI API for its experiments. The project is particularly suited to AI applications that need stronger decision-making and reasoning abilities, such as answering complex questions or completing specific tasks in virtual environments. By adjusting the agent type and reflexion strategy, users can explore how AI performance changes across configurations, which provides a rich set of experimental resources for research.",2,"2026-06-11 03:25:02","top_topic"]