[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-83975":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":8,"htmlUrl":8,"language":9,"languages":8,"totalLinesOfCode":8,"stars":10,"forks":11,"watchers":10,"openIssues":11,"contributorsCount":11,"subscribersCount":11,"size":11,"stars1d":11,"stars7d":11,"stars30d":11,"stars90d":11,"forks30d":11,"starsTrendScore":11,"compositeScore":12,"rankGlobal":8,"rankLanguage":8,"license":13,"archived":14,"fork":14,"defaultBranch":15,"hasWiki":16,"hasPages":14,"topics":17,"createdAt":8,"pushedAt":8,"updatedAt":18,"readmeContent":19,"aiSummary":8,"trendingCount":11,"starSnapshotCount":11,"syncStatus":20,"lastSyncTime":21,"discoverSource":22},83975,"faasflow","xiaorenwu234\u002Ffaasflow","xiaorenwu234",null,"Python",59,0,37,"Apache License 2.0",false,"main",true,[],"2026-06-12 04:01:42","# FaaSFlow\n\n## Introduction\n\nFaaSFlow is a serverless workflow engine that enables efficient workflow execution in two ways: a worker-side workflow schedule pattern that reduces scheduling overhead, and an adaptive storage library that uses local memory to transfer data between functions on the same node.\n\n*Our work has been accepted by ASPLOS '22. The paper is [FaaSFlow: Enable Efficient Workflow Execution for Function-as-a-Service](https:\u002F\u002Fdl.acm.org\u002Fdoi\u002F10.1145\u002F3503222.3507717).*\n\n[![Security Status](https:\u002F\u002Fs.murphysec.com\u002Fbadge\u002Flzjzx1122\u002FFaaSFlow.svg)](https:\u002F\u002Fwww.murphysec.com\u002Fp\u002Flzjzx1122\u002FFaaSFlow)\n\n\n## Hardware Dependencies and Private IP Address\n\n1. In our experiment setup, each worker node is an Aliyun ECS instance running Ubuntu 18.04 (ecs.g7.2xlarge, cores: 8, DRAM: 32GB), and the database node is an ecs.g6e.4xlarge instance (cores: 16, DRAM: 64GB) running Ubuntu 18.04 and CouchDB.\n\n2. 
Please save the private IP address of the storage node as **\u003Cmaster_ip>**, and the private IP address of each of the other 7 worker nodes as **\u003Cworker_ip>**. \n\n## About Config Setting\n\nThere are 2 places for config setting. `src\u002Fcontainer\u002Fcontainer_config.py` specifies the addresses of CouchDB and Redis; fill in the correct IP there so that application code can connect directly to the database from inside the container environment. All other configurations are in `config\u002Fconfig.py`.\n\n## Installation and Software Dependencies\n\nClone our code from `https:\u002F\u002Fgithub.com\u002Flzjzx1122\u002FFaaSFlow.git` and:\n\n1. Reset the `worker_address` configuration to your \u003Cworker_ip>:8000 in `src\u002Fgrouping\u002Fnode_info.yaml`. This specifies your workers' addresses. `scale_limit: 120` is the maximum number of containers that can be deployed on each 32GB-memory instance; it does not need to be changed by default.\n\n2. Reset `COUCHDB_URL` to `http:\u002F\u002Fopenwhisk:openwhisk@\u003Cmaster_ip>:5984\u002F` in both `config\u002Fconfig.py` and `src\u002Fcontainer\u002Fcontainer_config.py`. This specifies the database storage you built previously.\n\n3. Then, clone the modified code onto each node (8 nodes in total).\n\n4. On the storage node: Run `scripts\u002Fdb_setup.bash`. It installs Docker, CouchDB, and some Python packages, and builds grouping results from 8 benchmarks. Then allow up to 4096 connections by adding the following options to the configuration file `\u002Fopt\u002Fcouchdb\u002Fetc\u002Flocal.ini`:\n```\n    [httpd]\n    server_options = [{backlog, 128}, {acceptor_pool_size, 16}, {max, 4096}]\n```\n\n5. On each worker node: Run `scripts\u002Fworker_setup.bash`. 
This installs Docker, Redis, and some Python packages, and builds Docker images from 8 benchmarks.\n\n## WorkerSP Start-up\n\nThe following operations set up the system to run scripts under WorkerSP.\n\nFirst, set `DATA_MODE = optimized` and `CONTROL_MODE = WorkerSP` in the configuration on all 7 worker nodes and the storage node. Define `GATEWAY_ADDR` as `\u003Cmaster_ip>:7000`. Then, enter `src\u002Fworkflow_manager` and start the engine proxy with the local \u003Cworker_ip> on each worker node with the following \u003Cspan id=\"jump\">command\u003C\u002Fspan>: \n```\n    python3 proxy.py \u003Cworker_ip> 8000             # proxy start\n```\nThen start the gateway on the storage node with the following command: \n```\n    python3 gateway.py \u003Cmaster_ip> 7000           # gateway start\n```\nTo run scripts under WorkerSP, this completes the setup, and you can now send invocations **with the `run.py` scripts for all WorkerSP-based performance tests**. Detailed script usage is described in [Run Experiment](#jumpexper).\n\n**Note:** We recommend restarting `proxy.py` on each worker node and `gateway.py` on the master node whenever you start the `run.py` script, to avoid any potential bug.\n\n## MasterSP Start-up\n\nThe following operations set up the system to run scripts under MasterSP. First, set `DATA_MODE = raw` and `CONTROL_MODE = MasterSP` in the configuration on all 7 worker nodes and the storage node. Then, restart the engine proxy on each worker node with the [proxy start](#jump) command, and restart the gateway on the storage node with the [gateway start](#jump) command.\n\nDefine `MASTER_HOST` as `\u003Cmaster_ip>:8000`. 
Then, start another proxy on the storage node as the virtual master node with the following command: \n```\n    python3 proxy.py \u003Cmaster_ip> 8000\n```\nTo run scripts under MasterSP, this completes the setup, and you can now send invocations **with the `run.py` scripts for all MasterSP-based performance tests**. Detailed script usage is described in [Run Experiment](#jumpexper).\n\n## \u003Cspan id=\"jumpexper\">Run Experiment\u003C\u002Fspan>\n\nWe provide some test scripts under `test\u002Fasplos`.\n\u003Cspan id=\"note\">**Note:**\u003C\u002Fspan> We recommend restarting all `proxy.py` and `gateway.py` processes whenever you start the `run.py` script, to avoid any potential bug. The restart clears all background function containers and reclaims the memory space. \n\n### Scheduler Scalability: the overhead of the graph scheduler when scaling up the total nodes of one workflow\n\nRun directly on the storage node: \n```\n    python3 run.py\n```\n\n### Component Overhead: overhead of one workflow engine\n\nStart a proxy on any worker node (skip this if you already did so during the start-up above) and get its pid. 
Then run the following on any worker node:\n```\n    python3 run.py --pid=\u003Cpid>\n```\n\n### Data Overhead: total time spent on data transmission\n\nSet up the WorkerSP deployment, then run on the storage node: \n```\n    python3 run.py --datamode=optimized\n```\n\nThen set up the MasterSP deployment and run it again with `--datamode=raw`.\n\n### End-to-End Latency: run one-by-one and run all-at-once\n\nFirst, set up the WorkerSP deployment, then run on the storage node: \n```\n    python3 run.py --datamode=optimized --mode=single\n```\nThen terminate and restart all `proxy.py` and `gateway.py` processes (see the [note](#note) for the reason), and run it again with `--datamode=optimized --mode=corun`.\n\nSecond, set up the MasterSP deployment, then run on the storage node:\n```\n    python3 run.py --datamode=raw --mode=single\n```\nThen terminate and restart all `proxy.py` and `gateway.py` processes, and run it again with `--datamode=raw --mode=corun`.\n\n### Schedule Overhead: time spent on scheduling tasks\n\nSet up the WorkerSP deployment, then run on the storage node: \n```\n    python3 run.py --controlmode=WorkerSP\n```\nThen set up the MasterSP deployment and run it again with `--controlmode=MasterSP`.\n\n\n### 99%-ile latency at 50MB\u002Fs, 6 requests\u002Fmin for all benchmarks\n\n1. Download wondershaper from `https:\u002F\u002Fgithub.com\u002Fmagnific0\u002Fwondershaper` to the storage node.\n\n2. Set up the WorkerSP deployment, and run the following commands on your storage node. These clear the previous bandwidth setting and set the network bandwidth to 50MB\u002Fs:\n```\n    cd \u003Cyour_wondershaper_path>\u002Fwondershaper\n    .\u002Fwondershaper -a docker0 -c\n    .\u002Fwondershaper -a docker0 -u 409600 -d 409600\n```\n3. Then run the script on the storage node:\n```\n    python3 run.py --datamode=optimized\n```\n\n4. Set up the MasterSP deployment and run it again with `--datamode=raw`.\n\n\n### 99%-ile latency at 25MB\u002Fs-100MB\u002Fs, with different requests\u002Fmin for the genome and video benchmarks\n\n1. 
Set up the WorkerSP deployment, and run the following commands on your storage node. These clear the previous bandwidth setting and set the network bandwidth to 25MB\u002Fs:\n```\n    cd \u003Cyour_wondershaper_path>\u002Fwondershaper\n    .\u002Fwondershaper -a docker0 -c\n    .\u002Fwondershaper -a docker0 -u 204800 -d 204800\n```\nThen run the following command on the storage node. \n**Remember to restart all `proxy.py` and `gateway.py` processes whenever you start the `run.py` script, to avoid any potential bug.**\n\n```\n    python3 run.py --bandwidth=25 --datamode=optimized --workflow=genome\n```\n\n\n2. Clear the previous bandwidth setting and set the network bandwidth to 50MB\u002Fs:\n```\n    cd \u003Cyour_wondershaper_path>\u002Fwondershaper\n    .\u002Fwondershaper -a docker0 -c\n    .\u002Fwondershaper -a docker0 -u 409600 -d 409600\n```\nThen run the following command on the storage node.\n```\n    python3 run.py --bandwidth=50 --datamode=optimized --workflow=genome\n```\n3. Other configurations follow the same logic (`-u 614400 -d 614400` and `--bandwidth=75` correspond to 75MB\u002Fs; `-u 819200 -d 819200` and `--bandwidth=100` correspond to 100MB\u002Fs).\n\n4. Set up the MasterSP deployment and repeat steps 1 and 2, but with `--datamode=raw`. 
Then evaluate the video benchmark in the same way with `--workflow=video`.\n\n## Cite\nTo cite FaaSFlow, use:\n```\n@inproceedings{10.1145\u002F3503222.3507717,\n    author = {Li, Zijun and Liu, Yushi and Guo, Linsong and Chen, Quan and Cheng, Jiagan and Zheng, Wenli and Guo, Minyi},\n    title = {FaaSFlow: Enable Efficient Workflow Execution for Function-as-a-Service},\n    year = {2022},\n    isbn = {9781450392051},\n    publisher = {Association for Computing Machinery},\n    address = {New York, NY, USA},\n    url = {https:\u002F\u002Fdoi.org\u002F10.1145\u002F3503222.3507717},\n    doi = {10.1145\u002F3503222.3507717},\n    booktitle = {Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems},\n    pages = {782–796},\n    numpages = {15},\n    keywords = {serverless workflows, master-worker, graph partition, FaaS},\n    location = {Lausanne, Switzerland},\n    series = {ASPLOS 2022}\n}\n```\n",2,"2026-06-11 04:11:57","CREATED_QUERY"]