[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-2389":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":17,"stars7d":18,"stars30d":19,"stars90d":16,"forks30d":16,"starsTrendScore":20,"compositeScore":21,"rankGlobal":10,"rankLanguage":10,"license":22,"archived":23,"fork":23,"defaultBranch":24,"hasWiki":23,"hasPages":25,"topics":26,"createdAt":10,"pushedAt":10,"updatedAt":47,"readmeContent":48,"aiSummary":49,"trendingCount":16,"starSnapshotCount":16,"syncStatus":50,"lastSyncTime":51,"discoverSource":52},2389,"serve","jina-ai\u002Fserve","jina-ai","☁️ Build multimodal AI applications with cloud-native stack","https:\u002F\u002Fjina.ai\u002Fserve",null,"Python",21864,2236,211,1,0,3,4,8,9,45,"Apache License 2.0",false,"master",true,[27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46],"cloud-native","cncf","deep-learning","docker","fastapi","framework","generative-ai","grpc","jaeger","kubernetes","llmops","machine-learning","microservice","mlops","multimodal","neural-search","opentelemetry","orchestration","pipeline","prometheus","2026-06-12 02:00:40","# Jina-Serve\n\u003Ca href=\"https:\u002F\u002Fpypi.org\u002Fproject\u002Fjina\u002F\">\u003Cimg alt=\"PyPI\" src=\"https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fv\u002Fjina?label=Release&style=flat-square\">\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fdiscord.jina.ai\">\u003Cimg src=\"https:\u002F\u002Fimg.shields.io\u002Fdiscord\u002F1106542220112302130?logo=discord&logoColor=white&style=flat-square\">\u003C\u002Fa>\n\u003Ca href=\"https:\u002F\u002Fpypistats.org\u002Fpackages\u002Fjina\">\u003Cimg alt=\"PyPI - Downloads from official pypistats\" src=\"https:\u002F\u002Fimg.shields.io\u002Fpypi\u002Fdm\u002Fjina?style=flat-square\">\u003C\u002Fa>\n\u003Ca 
href=\"https:\u002F\u002Fgithub.com\u002Fjina-ai\u002Fjina\u002Factions\u002Fworkflows\u002Fcd.yml\">\u003Cimg alt=\"Github CD status\" src=\"https:\u002F\u002Fgithub.com\u002Fjina-ai\u002Fjina\u002Factions\u002Fworkflows\u002Fcd.yml\u002Fbadge.svg\">\u003C\u002Fa>\n\nJina-serve is a framework for building and deploying AI services that communicate via gRPC, HTTP and WebSockets. Scale your services from local development to production while focusing on your core logic.\n\n## Key Features\n\n- Native support for all major ML frameworks and data types\n- High-performance service design with scaling, streaming, and dynamic batching\n- LLM serving with streaming output\n- Built-in Docker integration and Executor Hub\n- One-click deployment to Jina AI Cloud\n- Enterprise-ready with Kubernetes and Docker Compose support\n\n\u003Cdetails>\n\u003Csummary>\u003Cstrong>Comparison with FastAPI\u003C\u002Fstrong>\u003C\u002Fsummary>\n\nKey advantages over FastAPI:\n\n- DocArray-based data handling with native gRPC support\n- Built-in containerization and service orchestration\n- Seamless scaling of microservices\n- One-command cloud deployment\n\u003C\u002Fdetails>\n\n## Install \n\n```bash\npip install jina\n```\n\nSee guides for [Apple Silicon](https:\u002F\u002Fjina.ai\u002Fserve\u002Fget-started\u002Finstall\u002Fapple-silicon-m1-m2\u002F) and [Windows](https:\u002F\u002Fjina.ai\u002Fserve\u002Fget-started\u002Finstall\u002Fwindows\u002F).\n\n## Core Concepts\n\nThree main layers:\n- **Data**: BaseDoc and DocList for input\u002Foutput\n- **Serving**: Executors process Documents, Gateway connects services\n- **Orchestration**: Deployments serve Executors, Flows create pipelines\n\n## Build AI Services\n\nLet's create a gRPC-based AI service using StableLM:\n\n```python\nfrom jina import Executor, requests\nfrom docarray import DocList, BaseDoc\nfrom transformers import pipeline\n\n\nclass Prompt(BaseDoc):\n    text: str\n\n\nclass Generation(BaseDoc):\n    prompt: str\n    
text: str\n\n\nclass StableLM(Executor):\n    def __init__(self, **kwargs):\n        super().__init__(**kwargs)\n        self.generator = pipeline(\n            'text-generation', model='stabilityai\u002Fstablelm-base-alpha-3b'\n        )\n\n    @requests\n    def generate(self, docs: DocList[Prompt], **kwargs) -> DocList[Generation]:\n        generations = DocList[Generation]()\n        prompts = docs.text\n        # For a list of prompts, the pipeline returns one list of candidates per prompt\n        llm_outputs = self.generator(prompts)\n        for prompt, output in zip(prompts, llm_outputs):\n            # Keep the text of the first candidate for each prompt\n            generations.append(\n                Generation(prompt=prompt, text=output[0]['generated_text'])\n            )\n        return generations\n```\n\nDeploy with Python or YAML:\n\n```python\nfrom jina import Deployment\nfrom executor import StableLM\n\ndep = Deployment(uses=StableLM, timeout_ready=-1, port=12345)\n\nwith dep:\n    dep.block()\n```\n\n```yaml\njtype: Deployment\nwith:\n uses: StableLM\n py_modules:\n   - executor.py\n timeout_ready: -1\n port: 12345\n```\n\nUse the client:\n\n```python\nfrom jina import Client\nfrom docarray import DocList\nfrom executor import Prompt, Generation\n\nprompt = Prompt(text='suggest an interesting image generation prompt')\nclient = Client(port=12345)\nresponse = client.post('\u002F', inputs=[prompt], return_type=DocList[Generation])\n```\n\n## Build Pipelines\n\nChain services into a Flow:\n\n```python\nfrom jina import Flow\nfrom executor import StableLM\nfrom text_to_image import TextToImage  # assumes TextToImage is defined in text_to_image.py\n\nflow = Flow(port=12345).add(uses=StableLM).add(uses=TextToImage)\n\nwith flow:\n    flow.block()\n```\n\n## Scaling and Deployment\n\n### Local Scaling\n\nBoost throughput with built-in features:\n- Replicas for parallel processing\n- Shards for data partitioning\n- Dynamic batching for efficient model inference\n\nExample: scaling a Stable Diffusion deployment with two replicas and dynamic batching:\n\n```yaml\njtype: Deployment\nwith:\n uses: TextToImage\n timeout_ready: -1\n py_modules:\n   - text_to_image.py\n env:\n  CUDA_VISIBLE_DEVICES: RR\n replicas: 2\n uses_dynamic_batching:\n   \u002Fdefault:\n     preferred_batch_size: 10\n     timeout: 200\n```\n\n### 
Cloud Deployment\n\n#### Containerize Services\n\n1. Structure your Executor:\n```\nTextToImage\u002F\n├── executor.py\n├── config.yml\n└── requirements.txt\n```\n\n2. Configure:\n```yaml\n# config.yml\njtype: TextToImage\npy_modules:\n - executor.py\nmetas:\n name: TextToImage\n description: Text to Image generation Executor\n```\n\n3. Push to Hub:\n```bash\njina hub push TextToImage\n```\n\n#### Deploy to Kubernetes\n```bash\njina export kubernetes flow.yml .\u002Fmy-k8s\nkubectl apply -R -f my-k8s\n```\n\n#### Use Docker Compose\n```bash\njina export docker-compose flow.yml docker-compose.yml\ndocker-compose up\n```\n\n#### JCloud Deployment\n\nDeploy with a single command:\n```bash\njina cloud deploy jcloud-flow.yml\n```\n\n## LLM Streaming\n\nEnable token-by-token streaming for responsive LLM applications:\n\n1. Define schemas:\n```python\nfrom docarray import BaseDoc\n\n\nclass PromptDocument(BaseDoc):\n    prompt: str\n    max_tokens: int\n\n\nclass ModelOutputDocument(BaseDoc):\n    token_id: int\n    generated_text: str\n```\n\n2. Initialize service:\n```python\nimport torch\nfrom jina import Executor, requests\nfrom transformers import GPT2Tokenizer, GPT2LMHeadModel\n\n# Module-level tokenizer used by the streaming endpoint\ntokenizer = GPT2Tokenizer.from_pretrained('gpt2')\n\n\nclass TokenStreamingExecutor(Executor):\n    def __init__(self, **kwargs):\n        super().__init__(**kwargs)\n        self.model = GPT2LMHeadModel.from_pretrained('gpt2')\n```\n\n3. 
Implement streaming:\n```python\n# Method of TokenStreamingExecutor: yields one document per generated token\n@requests(on='\u002Fstream')\nasync def task(self, doc: PromptDocument, **kwargs) -> ModelOutputDocument:\n    inputs = tokenizer(doc.prompt, return_tensors='pt')\n    input_len = inputs['input_ids'].shape[1]\n    for _ in range(doc.max_tokens):\n        # Generate exactly one new token per iteration\n        output = self.model.generate(**inputs, max_new_tokens=1)\n        if output[0][-1] == tokenizer.eos_token_id:\n            break\n        yield ModelOutputDocument(\n            token_id=int(output[0][-1]),\n            generated_text=tokenizer.decode(\n                output[0][input_len:], skip_special_tokens=True\n            ),\n        )\n        # Feed the extended sequence back in for the next step\n        inputs = {\n            'input_ids': output,\n            'attention_mask': torch.ones(1, len(output[0])),\n        }\n```\n\n4. Serve and use:\n```python\nimport asyncio\n\nfrom jina import Client, Deployment\n\n# Server\nwith Deployment(uses=TokenStreamingExecutor, port=12345, protocol='grpc') as dep:\n    dep.block()\n\n\n# Client (run in a separate process while the server is up)\nasync def main():\n    client = Client(port=12345, protocol='grpc', asyncio=True)\n    async for doc in client.stream_doc(\n        on='\u002Fstream',\n        inputs=PromptDocument(prompt='What is the capital of France?', max_tokens=10),\n        return_type=ModelOutputDocument,\n    ):\n        print(doc.generated_text)\n\n\nasyncio.run(main())\n```\n\n## Support\n\nJina-serve is backed by [Jina AI](https:\u002F\u002Fjina.ai) and licensed under [Apache-2.0](.\u002FLICENSE).\n","Jina-serve is a framework for building and deploying AI services that communicate over gRPC, HTTP, and WebSocket. It supports mainstream machine learning frameworks and data types and offers a high-performance service design, including dynamic batching, streaming, and large-scale scaling. In addition, Jina-serve provides built-in Docker integration and Executor Hub, supports one-click deployment to Jina AI Cloud, and offers enterprise-grade Kubernetes and Docker Compose support. It suits multimodal AI application development that needs a seamless transition from local development to production, especially when the focus is on implementing core business logic while simplifying the deployment workflow.",2,"2026-06-11 02:49:44","top_language"]