Merge remote-tracking branch 'origin/main'

pull/21063/head
hsiong 11 months ago
commit 7f0970d87c

@ -84,10 +84,8 @@ jobs:
elasticsearch
oceanbase
- name: Check VDB Ready (TiDB, Oceanbase)
run: |
uv run --project api python api/tests/integration_tests/vdb/tidb_vector/check_tiflash_ready.py
uv run --project api python api/tests/integration_tests/vdb/oceanbase/check_oceanbase_ready.py
- name: Check VDB Ready (TiDB)
run: uv run --project api python api/tests/integration_tests/vdb/tidb_vector/check_tiflash_ready.py
- name: Test Vector Stores
run: uv run --project api bash dev/pytest/pytest_vdb.sh

4
.gitignore vendored

@ -179,6 +179,7 @@ docker/volumes/pgvecto_rs/data/*
docker/volumes/couchbase/*
docker/volumes/oceanbase/*
docker/volumes/plugin_daemon/*
docker/volumes/matrixone/*
!docker/volumes/oceanbase/init.d
docker/nginx/conf.d/default.conf
@ -210,3 +211,6 @@ mise.toml
# Next.js build output
.next/
# AI Assistant
.roo/

@ -226,6 +226,11 @@ Deploy Dify to AWS with [CDK](https://aws.amazon.com/cdk/)
- [AWS CDK by @KevinZhao](https://github.com/aws-samples/solution-for-deploying-dify-on-aws)
#### Using Alibaba Cloud Computing Nest
Quickly deploy Dify to Alibaba cloud with [Alibaba Cloud Computing Nest](https://computenest.console.aliyun.com/service/instance/create/default?type=user&ServiceName=Dify%E7%A4%BE%E5%8C%BA%E7%89%88)
## Contributing
For those who'd like to contribute code, see our [Contribution Guide](https://github.com/langgenius/dify/blob/main/CONTRIBUTING.md).

@ -209,6 +209,9 @@ docker compose up -d
- [AWS CDK بواسطة @KevinZhao](https://github.com/aws-samples/solution-for-deploying-dify-on-aws)
#### استخدام Alibaba Cloud للنشر
[بسرعة نشر Dify إلى سحابة علي بابا مع عش الحوسبة السحابية علي بابا](https://computenest.console.aliyun.com/service/instance/create/default?type=user&ServiceName=Dify%E7%A4%BE%E5%8C%BA%E7%89%88)
## المساهمة
لأولئك الذين يرغبون في المساهمة، انظر إلى [دليل المساهمة](https://github.com/langgenius/dify/blob/main/CONTRIBUTING.md) لدينا.

@ -225,6 +225,11 @@ GitHub-এ ডিফাইকে স্টার দিয়ে রাখুন
- [AWS CDK by @KevinZhao](https://github.com/aws-samples/solution-for-deploying-dify-on-aws)
#### Alibaba Cloud ব্যবহার করে ডিপ্লয়
[Alibaba Cloud Computing Nest](https://computenest.console.aliyun.com/service/instance/create/default?type=user&ServiceName=Dify%E7%A4%BE%E5%8C%BA%E7%89%88)
## Contributing
যারা কোড অবদান রাখতে চান, তাদের জন্য আমাদের [অবদান নির্দেশিকা] দেখুন (https://github.com/langgenius/dify/blob/main/CONTRIBUTING.md)।

@ -221,6 +221,11 @@ docker compose up -d
##### AWS
- [AWS CDK by @KevinZhao](https://github.com/aws-samples/solution-for-deploying-dify-on-aws)
#### 使用 阿里云计算巢 部署
使用 [阿里云计算巢](https://computenest.console.aliyun.com/service/instance/create/default?type=user&ServiceName=Dify%E7%A4%BE%E5%8C%BA%E7%89%88) 将 Dify 一键部署到 阿里云
## Star History
[![Star History Chart](https://api.star-history.com/svg?repos=langgenius/dify&type=Date)](https://star-history.com/#langgenius/dify&Date)

@ -221,6 +221,11 @@ Bereitstellung von Dify auf AWS mit [CDK](https://aws.amazon.com/cdk/)
##### AWS
- [AWS CDK by @KevinZhao](https://github.com/aws-samples/solution-for-deploying-dify-on-aws)
#### Alibaba Cloud
[Alibaba Cloud Computing Nest](https://computenest.console.aliyun.com/service/instance/create/default?type=user&ServiceName=Dify%E7%A4%BE%E5%8C%BA%E7%89%88)
## Contributing
Falls Sie Code beitragen möchten, lesen Sie bitte unseren [Contribution Guide](https://github.com/langgenius/dify/blob/main/CONTRIBUTING.md). Gleichzeitig bitten wir Sie, Dify zu unterstützen, indem Sie es in den sozialen Medien teilen und auf Veranstaltungen und Konferenzen präsentieren.

@ -221,6 +221,10 @@ Despliegue Dify en AWS usando [CDK](https://aws.amazon.com/cdk/)
##### AWS
- [AWS CDK por @KevinZhao](https://github.com/aws-samples/solution-for-deploying-dify-on-aws)
#### Alibaba Cloud
[Alibaba Cloud Computing Nest](https://computenest.console.aliyun.com/service/instance/create/default?type=user&ServiceName=Dify%E7%A4%BE%E5%8C%BA%E7%89%88)
## Contribuir
Para aquellos que deseen contribuir con código, consulten nuestra [Guía de contribución](https://github.com/langgenius/dify/blob/main/CONTRIBUTING.md).

@ -219,6 +219,11 @@ Déployez Dify sur AWS en utilisant [CDK](https://aws.amazon.com/cdk/)
##### AWS
- [AWS CDK par @KevinZhao](https://github.com/aws-samples/solution-for-deploying-dify-on-aws)
#### Alibaba Cloud
[Alibaba Cloud Computing Nest](https://computenest.console.aliyun.com/service/instance/create/default?type=user&ServiceName=Dify%E7%A4%BE%E5%8C%BA%E7%89%88)
## Contribuer
Pour ceux qui souhaitent contribuer du code, consultez notre [Guide de contribution](https://github.com/langgenius/dify/blob/main/CONTRIBUTING.md).

@ -220,6 +220,10 @@ docker compose up -d
##### AWS
- [@KevinZhaoによるAWS CDK](https://github.com/aws-samples/solution-for-deploying-dify-on-aws)
#### Alibaba Cloud
[Alibaba Cloud Computing Nest](https://computenest.console.aliyun.com/service/instance/create/default?type=user&ServiceName=Dify%E7%A4%BE%E5%8C%BA%E7%89%88)
## 貢献
コードに貢献したい方は、[Contribution Guide](https://github.com/langgenius/dify/blob/main/CONTRIBUTING.md)を参照してください。

@ -219,6 +219,11 @@ wa'logh nIqHom neH ghun deployment toy'wI' [CDK](https://aws.amazon.com/cdk/) lo
##### AWS
- [AWS CDK qachlot @KevinZhao](https://github.com/aws-samples/solution-for-deploying-dify-on-aws)
#### Alibaba Cloud
[Alibaba Cloud Computing Nest](https://computenest.console.aliyun.com/service/instance/create/default?type=user&ServiceName=Dify%E7%A4%BE%E5%8C%BA%E7%89%88)
## Contributing
For those who'd like to contribute code, see our [Contribution Guide](https://github.com/langgenius/dify/blob/main/CONTRIBUTING.md).

@ -213,6 +213,11 @@ Dify를 Kubernetes에 배포하고 프리미엄 스케일링 설정을 구성했
##### AWS
- [KevinZhao의 AWS CDK](https://github.com/aws-samples/solution-for-deploying-dify-on-aws)
#### Alibaba Cloud
[Alibaba Cloud Computing Nest](https://computenest.console.aliyun.com/service/instance/create/default?type=user&ServiceName=Dify%E7%A4%BE%E5%8C%BA%E7%89%88)
## 기여
코드에 기여하고 싶은 분들은 [기여 가이드](https://github.com/langgenius/dify/blob/main/CONTRIBUTING.md)를 참조하세요.

@ -218,6 +218,11 @@ Implante o Dify na AWS usando [CDK](https://aws.amazon.com/cdk/)
##### AWS
- [AWS CDK por @KevinZhao](https://github.com/aws-samples/solution-for-deploying-dify-on-aws)
#### Alibaba Cloud
[Alibaba Cloud Computing Nest](https://computenest.console.aliyun.com/service/instance/create/default?type=user&ServiceName=Dify%E7%A4%BE%E5%8C%BA%E7%89%88)
## Contribuindo
Para aqueles que desejam contribuir com código, veja nosso [Guia de Contribuição](https://github.com/langgenius/dify/blob/main/CONTRIBUTING.md).

@ -219,6 +219,11 @@ Uvedite Dify v AWS z uporabo [CDK](https://aws.amazon.com/cdk/)
##### AWS
- [AWS CDK by @KevinZhao](https://github.com/aws-samples/solution-for-deploying-dify-on-aws)
#### Alibaba Cloud
[Alibaba Cloud Computing Nest](https://computenest.console.aliyun.com/service/instance/create/default?type=user&ServiceName=Dify%E7%A4%BE%E5%8C%BA%E7%89%88)
## Prispevam
Za tiste, ki bi radi prispevali kodo, si oglejte naš vodnik za prispevke . Hkrati vas prosimo, da podprete Dify tako, da ga delite na družbenih medijih ter na dogodkih in konferencah.

@ -212,6 +212,11 @@ Dify'ı bulut platformuna tek tıklamayla dağıtın [terraform](https://www.ter
##### AWS
- [AWS CDK tarafından @KevinZhao](https://github.com/aws-samples/solution-for-deploying-dify-on-aws)
#### Alibaba Cloud
[Alibaba Cloud Computing Nest](https://computenest.console.aliyun.com/service/instance/create/default?type=user&ServiceName=Dify%E7%A4%BE%E5%8C%BA%E7%89%88)
## Katkıda Bulunma
Kod katkısında bulunmak isteyenler için [Katkı Kılavuzumuza](https://github.com/langgenius/dify/blob/main/CONTRIBUTING.md) bakabilirsiniz.

@ -224,6 +224,11 @@ Dify 的所有功能都提供相應的 API因此您可以輕鬆地將 Dify
- [由 @KevinZhao 提供的 AWS CDK](https://github.com/aws-samples/solution-for-deploying-dify-on-aws)
#### 使用 阿里云计算巢進行部署
[阿里云](https://computenest.console.aliyun.com/service/instance/create/default?type=user&ServiceName=Dify%E7%A4%BE%E5%8C%BA%E7%89%88)
## 貢獻
對於想要貢獻程式碼的開發者,請參閱我們的[貢獻指南](https://github.com/langgenius/dify/blob/main/CONTRIBUTING.md)。

@ -214,6 +214,12 @@ Triển khai Dify trên AWS bằng [CDK](https://aws.amazon.com/cdk/)
##### AWS
- [AWS CDK bởi @KevinZhao](https://github.com/aws-samples/solution-for-deploying-dify-on-aws)
#### Alibaba Cloud
[Alibaba Cloud Computing Nest](https://computenest.console.aliyun.com/service/instance/create/default?type=user&ServiceName=Dify%E7%A4%BE%E5%8C%BA%E7%89%88)
## Đóng góp
Đối với những người muốn đóng góp mã, xem [Hướng dẫn Đóng góp](https://github.com/langgenius/dify/blob/main/CONTRIBUTING.md) của chúng tôi.

@ -137,7 +137,7 @@ WEB_API_CORS_ALLOW_ORIGINS=http://127.0.0.1:3000,*
CONSOLE_CORS_ALLOW_ORIGINS=http://127.0.0.1:3000,*
# Vector database configuration
# support: weaviate, qdrant, milvus, myscale, relyt, pgvecto_rs, pgvector, pgvector, chroma, opensearch, tidb_vector, couchbase, vikingdb, upstash, lindorm, oceanbase, opengauss, tablestore
# support: weaviate, qdrant, milvus, myscale, relyt, pgvecto_rs, pgvector, pgvector, chroma, opensearch, tidb_vector, couchbase, vikingdb, upstash, lindorm, oceanbase, opengauss, tablestore, matrixone
VECTOR_STORE=weaviate
# Weaviate configuration
@ -294,6 +294,13 @@ VIKINGDB_SCHEMA=http
VIKINGDB_CONNECTION_TIMEOUT=30
VIKINGDB_SOCKET_TIMEOUT=30
# Matrixone configration
MATRIXONE_HOST=127.0.0.1
MATRIXONE_PORT=6001
MATRIXONE_USER=dump
MATRIXONE_PASSWORD=111
MATRIXONE_DATABASE=dify
# Lindorm configuration
LINDORM_URL=http://ld-*******************-proxy-search-pub.lindorm.aliyuncs.com:30070
LINDORM_USERNAME=admin
@ -332,9 +339,11 @@ PROMPT_GENERATION_MAX_TOKENS=512
CODE_GENERATION_MAX_TOKENS=1024
PLUGIN_BASED_TOKEN_COUNTING_ENABLED=false
# Mail configuration, support: resend, smtp
# Mail configuration, support: resend, smtp, sendgrid
MAIL_TYPE=
# If using SendGrid, use the 'from' field for authentication if necessary.
MAIL_DEFAULT_SEND_FROM=no-reply <no-reply@dify.ai>
# resend configuration
RESEND_API_KEY=
RESEND_API_URL=https://api.resend.com
# smtp configuration
@ -344,7 +353,8 @@ SMTP_USERNAME=123
SMTP_PASSWORD=abc
SMTP_USE_TLS=true
SMTP_OPPORTUNISTIC_TLS=false
# Sendgid configuration
SENDGRID_API_KEY=
# Sentry configuration
SENTRY_DSN=

@ -281,6 +281,7 @@ def migrate_knowledge_vector_database():
VectorType.ELASTICSEARCH,
VectorType.OPENGAUSS,
VectorType.TABLESTORE,
VectorType.MATRIXONE,
}
lower_collection_vector_types = {
VectorType.ANALYTICDB,

@ -609,7 +609,7 @@ class MailConfig(BaseSettings):
"""
MAIL_TYPE: Optional[str] = Field(
description="Email service provider type ('smtp' or 'resend'), default to None.",
description="Email service provider type ('smtp' or 'resend' or 'sendGrid), default to None.",
default=None,
)
@ -663,6 +663,11 @@ class MailConfig(BaseSettings):
default=50,
)
SENDGRID_API_KEY: Optional[str] = Field(
description="API key for SendGrid service",
default=None,
)
class RagEtlConfig(BaseSettings):
"""

@ -24,6 +24,7 @@ from .vdb.couchbase_config import CouchbaseConfig
from .vdb.elasticsearch_config import ElasticsearchConfig
from .vdb.huawei_cloud_config import HuaweiCloudConfig
from .vdb.lindorm_config import LindormConfig
from .vdb.matrixone_config import MatrixoneConfig
from .vdb.milvus_config import MilvusConfig
from .vdb.myscale_config import MyScaleConfig
from .vdb.oceanbase_config import OceanBaseVectorConfig
@ -323,5 +324,6 @@ class MiddlewareConfig(
OpenGaussConfig,
TableStoreConfig,
DatasetQueueMonitorConfig,
MatrixoneConfig,
):
pass

@ -0,0 +1,14 @@
from pydantic import BaseModel, Field
class MatrixoneConfig(BaseModel):
"""Matrixone vector database configuration."""
MATRIXONE_HOST: str = Field(default="localhost", description="Host address of the Matrixone server")
MATRIXONE_PORT: int = Field(default=6001, description="Port number of the Matrixone server")
MATRIXONE_USER: str = Field(default="dump", description="Username for authenticating with Matrixone")
MATRIXONE_PASSWORD: str = Field(default="111", description="Password for authenticating with Matrixone")
MATRIXONE_DATABASE: str = Field(default="dify", description="Name of the Matrixone database to connect to")
MATRIXONE_METRIC: str = Field(
default="l2", description="Distance metric type for vector similarity search (cosine or l2)"
)

@ -56,8 +56,7 @@ class InsertExploreAppListApi(Resource):
parser.add_argument("position", type=int, required=True, nullable=False, location="json")
args = parser.parse_args()
with Session(db.engine) as session:
app = session.execute(select(App).filter(App.id == args["app_id"])).scalar_one_or_none()
app = db.session.execute(select(App).filter(App.id == args["app_id"])).scalar_one_or_none()
if not app:
raise NotFound(f"App '{args['app_id']}' is not found")

@ -208,7 +208,7 @@ class AnnotationBatchImportApi(Resource):
if len(request.files) > 1:
raise TooManyFilesError()
# check file type
if not file.filename or not file.filename.endswith(".csv"):
if not file.filename or not file.filename.lower().endswith(".csv"):
raise ValueError("Invalid file type. Only CSV files are allowed")
return AppAnnotationService.batch_import_app_annotations(app_id, file)

@ -17,6 +17,8 @@ from libs.login import login_required
from models import Account
from models.model import App
from services.app_dsl_service import AppDslService, ImportStatus
from services.enterprise.enterprise_service import EnterpriseService
from services.feature_service import FeatureService
class AppImportApi(Resource):
@ -60,7 +62,9 @@ class AppImportApi(Resource):
app_id=args.get("app_id"),
)
session.commit()
if result.app_id and FeatureService.get_system_features().webapp_auth.enabled:
# update web app setting as private
EnterpriseService.WebAppAuth.update_app_access_mode(result.app_id, "private")
# Return appropriate status code based on result
status = result.status
if status == ImportStatus.FAILED.value:

@ -34,6 +34,20 @@ class WorkflowAppLogApi(Resource):
parser.add_argument(
"created_at__after", type=str, location="args", help="Filter logs created after this timestamp"
)
parser.add_argument(
"created_by_end_user_session_id",
type=str,
location="args",
required=False,
default=None,
)
parser.add_argument(
"created_by_account",
type=str,
location="args",
required=False,
default=None,
)
parser.add_argument("page", type=int_range(1, 99999), default=1, location="args")
parser.add_argument("limit", type=int_range(1, 100), default=20, location="args")
args = parser.parse_args()
@ -57,6 +71,8 @@ class WorkflowAppLogApi(Resource):
created_at_after=args.created_at__after,
page=args.page,
limit=args.limit,
created_by_end_user_session_id=args.created_by_end_user_session_id,
created_by_account=args.created_by_account,
)
return workflow_app_log_pagination

@ -686,6 +686,7 @@ class DatasetRetrievalSettingApi(Resource):
| VectorType.TABLESTORE
| VectorType.HUAWEI_CLOUD
| VectorType.TENCENT
| VectorType.MATRIXONE
):
return {
"retrieval_method": [
@ -733,6 +734,7 @@ class DatasetRetrievalSettingMockApi(Resource):
| VectorType.TABLESTORE
| VectorType.TENCENT
| VectorType.HUAWEI_CLOUD
| VectorType.MATRIXONE
):
return {
"retrieval_method": [

@ -43,7 +43,6 @@ from core.model_runtime.errors.invoke import InvokeAuthorizationError
from core.plugin.impl.exc import PluginDaemonClientSideError
from core.rag.extractor.entity.extract_setting import ExtractSetting
from extensions.ext_database import db
from extensions.ext_redis import redis_client
from fields.document_fields import (
dataset_and_document_fields,
document_fields,
@ -54,8 +53,6 @@ from libs.login import login_required
from models import Dataset, DatasetProcessRule, Document, DocumentSegment, UploadFile
from services.dataset_service import DatasetService, DocumentService
from services.entities.knowledge_entities.knowledge_entities import KnowledgeConfig
from tasks.add_document_to_index_task import add_document_to_index_task
from tasks.remove_document_from_index_task import remove_document_from_index_task
class DocumentResource(Resource):
@ -862,77 +859,16 @@ class DocumentStatusApi(DocumentResource):
DatasetService.check_dataset_permission(dataset, current_user)
document_ids = request.args.getlist("document_id")
for document_id in document_ids:
document = self.get_document(dataset_id, document_id)
indexing_cache_key = "document_{}_indexing".format(document.id)
cache_result = redis_client.get(indexing_cache_key)
if cache_result is not None:
raise InvalidActionError(f"Document:{document.name} is being indexed, please try again later")
if action == "enable":
if document.enabled:
continue
document.enabled = True
document.disabled_at = None
document.disabled_by = None
document.updated_at = datetime.now(UTC).replace(tzinfo=None)
db.session.commit()
# Set cache to prevent indexing the same document multiple times
redis_client.setex(indexing_cache_key, 600, 1)
add_document_to_index_task.delay(document_id)
elif action == "disable":
if not document.completed_at or document.indexing_status != "completed":
raise InvalidActionError(f"Document: {document.name} is not completed.")
if not document.enabled:
continue
document.enabled = False
document.disabled_at = datetime.now(UTC).replace(tzinfo=None)
document.disabled_by = current_user.id
document.updated_at = datetime.now(UTC).replace(tzinfo=None)
db.session.commit()
# Set cache to prevent indexing the same document multiple times
redis_client.setex(indexing_cache_key, 600, 1)
remove_document_from_index_task.delay(document_id)
elif action == "archive":
if document.archived:
continue
document.archived = True
document.archived_at = datetime.now(UTC).replace(tzinfo=None)
document.archived_by = current_user.id
document.updated_at = datetime.now(UTC).replace(tzinfo=None)
db.session.commit()
if document.enabled:
# Set cache to prevent indexing the same document multiple times
redis_client.setex(indexing_cache_key, 600, 1)
remove_document_from_index_task.delay(document_id)
elif action == "un_archive":
if not document.archived:
continue
document.archived = False
document.archived_at = None
document.archived_by = None
document.updated_at = datetime.now(UTC).replace(tzinfo=None)
db.session.commit()
# Set cache to prevent indexing the same document multiple times
redis_client.setex(indexing_cache_key, 600, 1)
add_document_to_index_task.delay(document_id)
try:
DocumentService.batch_update_document_status(dataset, document_ids, action, current_user)
except services.errors.document.DocumentIndexingError as e:
raise InvalidActionError(str(e))
except ValueError as e:
raise InvalidActionError(str(e))
except NotFound as e:
raise NotFound(str(e))
else:
raise InvalidActionError()
return {"result": "success"}, 200

@ -374,7 +374,7 @@ class DatasetDocumentSegmentBatchImportApi(Resource):
if len(request.files) > 1:
raise TooManyFilesError()
# check file type
if not file.filename or not file.filename.endswith(".csv"):
if not file.filename or not file.filename.lower().endswith(".csv"):
raise ValueError("Invalid file type. Only CSV files are allowed")
try:

@ -15,7 +15,7 @@ class LoadBalancingCredentialsValidateApi(Resource):
@login_required
@account_initialization_required
def post(self, provider: str):
if not TenantAccountRole.is_privileged_role(current_user.current_tenant.current_role):
if not TenantAccountRole.is_privileged_role(current_user.current_role):
raise Forbidden()
tenant_id = current_user.current_tenant_id
@ -64,7 +64,7 @@ class LoadBalancingConfigCredentialsValidateApi(Resource):
@login_required
@account_initialization_required
def post(self, provider: str, config_id: str):
if not TenantAccountRole.is_privileged_role(current_user.current_tenant.current_role):
if not TenantAccountRole.is_privileged_role(current_user.current_role):
raise Forbidden()
tenant_id = current_user.current_tenant_id

@ -135,6 +135,20 @@ class WorkflowAppLogApi(Resource):
parser.add_argument("status", type=str, choices=["succeeded", "failed", "stopped"], location="args")
parser.add_argument("created_at__before", type=str, location="args")
parser.add_argument("created_at__after", type=str, location="args")
parser.add_argument(
"created_by_end_user_session_id",
type=str,
location="args",
required=False,
default=None,
)
parser.add_argument(
"created_by_account",
type=str,
location="args",
required=False,
default=None,
)
parser.add_argument("page", type=int_range(1, 99999), default=1, location="args")
parser.add_argument("limit", type=int_range(1, 100), default=20, location="args")
args = parser.parse_args()
@ -158,6 +172,8 @@ class WorkflowAppLogApi(Resource):
created_at_after=args.created_at__after,
page=args.page,
limit=args.limit,
created_by_end_user_session_id=args.created_by_end_user_session_id,
created_by_account=args.created_by_account,
)
return workflow_app_log_pagination

@ -4,8 +4,12 @@ from werkzeug.exceptions import Forbidden, NotFound
import services.dataset_service
from controllers.service_api import api
from controllers.service_api.dataset.error import DatasetInUseError, DatasetNameDuplicateError
from controllers.service_api.wraps import DatasetApiResource, validate_dataset_token
from controllers.service_api.dataset.error import DatasetInUseError, DatasetNameDuplicateError, InvalidActionError
from controllers.service_api.wraps import (
DatasetApiResource,
cloud_edition_billing_rate_limit_check,
validate_dataset_token,
)
from core.model_runtime.entities.model_entities import ModelType
from core.plugin.entities.plugin import ModelProviderID
from core.provider_manager import ProviderManager
@ -13,7 +17,7 @@ from fields.dataset_fields import dataset_detail_fields
from fields.tag_fields import tag_fields
from libs.login import current_user
from models.dataset import Dataset, DatasetPermissionEnum
from services.dataset_service import DatasetPermissionService, DatasetService
from services.dataset_service import DatasetPermissionService, DatasetService, DocumentService
from services.entities.knowledge_entities.knowledge_entities import RetrievalModel
from services.tag_service import TagService
@ -70,6 +74,7 @@ class DatasetListApi(DatasetApiResource):
response = {"data": data, "has_more": len(datasets) == limit, "limit": limit, "total": total, "page": page}
return response, 200
@cloud_edition_billing_rate_limit_check("knowledge", "dataset")
def post(self, tenant_id):
"""Resource for creating datasets."""
parser = reqparse.RequestParser()
@ -193,6 +198,7 @@ class DatasetApi(DatasetApiResource):
return data, 200
@cloud_edition_billing_rate_limit_check("knowledge", "dataset")
def patch(self, _, dataset_id):
dataset_id_str = str(dataset_id)
dataset = DatasetService.get_dataset(dataset_id_str)
@ -293,6 +299,7 @@ class DatasetApi(DatasetApiResource):
return result_data, 200
@cloud_edition_billing_rate_limit_check("knowledge", "dataset")
def delete(self, _, dataset_id):
"""
Deletes a dataset given its ID.
@ -322,6 +329,56 @@ class DatasetApi(DatasetApiResource):
raise DatasetInUseError()
class DocumentStatusApi(DatasetApiResource):
"""Resource for batch document status operations."""
def patch(self, tenant_id, dataset_id, action):
"""
Batch update document status.
Args:
tenant_id: tenant id
dataset_id: dataset id
action: action to perform (enable, disable, archive, un_archive)
Returns:
dict: A dictionary with a key 'result' and a value 'success'
int: HTTP status code 200 indicating that the operation was successful.
Raises:
NotFound: If the dataset with the given ID does not exist.
Forbidden: If the user does not have permission.
InvalidActionError: If the action is invalid or cannot be performed.
"""
dataset_id_str = str(dataset_id)
dataset = DatasetService.get_dataset(dataset_id_str)
if dataset is None:
raise NotFound("Dataset not found.")
# Check user's permission
try:
DatasetService.check_dataset_permission(dataset, current_user)
except services.errors.account.NoPermissionError as e:
raise Forbidden(str(e))
# Check dataset model setting
DatasetService.check_dataset_model_setting(dataset)
# Get document IDs from request body
data = request.get_json()
document_ids = data.get("document_ids", [])
try:
DocumentService.batch_update_document_status(dataset, document_ids, action, current_user)
except services.errors.document.DocumentIndexingError as e:
raise InvalidActionError(str(e))
except ValueError as e:
raise InvalidActionError(str(e))
return {"result": "success"}, 200
class DatasetTagsApi(DatasetApiResource):
@validate_dataset_token
@marshal_with(tag_fields)
@ -450,6 +507,7 @@ class DatasetTagsBindingStatusApi(DatasetApiResource):
api.add_resource(DatasetListApi, "/datasets")
api.add_resource(DatasetApi, "/datasets/<uuid:dataset_id>")
api.add_resource(DocumentStatusApi, "/datasets/<uuid:dataset_id>/documents/status/<string:action>")
api.add_resource(DatasetTagsApi, "/datasets/tags")
api.add_resource(DatasetTagBindingApi, "/datasets/tags/binding")
api.add_resource(DatasetTagUnbindingApi, "/datasets/tags/unbinding")

@ -19,7 +19,11 @@ from controllers.service_api.dataset.error import (
ArchivedDocumentImmutableError,
DocumentIndexingError,
)
from controllers.service_api.wraps import DatasetApiResource, cloud_edition_billing_resource_check
from controllers.service_api.wraps import (
DatasetApiResource,
cloud_edition_billing_rate_limit_check,
cloud_edition_billing_resource_check,
)
from core.errors.error import ProviderTokenNotInitError
from extensions.ext_database import db
from fields.document_fields import document_fields, document_status_fields
@ -35,6 +39,7 @@ class DocumentAddByTextApi(DatasetApiResource):
@cloud_edition_billing_resource_check("vector_space", "dataset")
@cloud_edition_billing_resource_check("documents", "dataset")
@cloud_edition_billing_rate_limit_check("knowledge", "dataset")
def post(self, tenant_id, dataset_id):
"""Create document by text."""
parser = reqparse.RequestParser()
@ -99,6 +104,7 @@ class DocumentUpdateByTextApi(DatasetApiResource):
"""Resource for update documents."""
@cloud_edition_billing_resource_check("vector_space", "dataset")
@cloud_edition_billing_rate_limit_check("knowledge", "dataset")
def post(self, tenant_id, dataset_id, document_id):
"""Update document by text."""
parser = reqparse.RequestParser()
@ -158,6 +164,7 @@ class DocumentAddByFileApi(DatasetApiResource):
@cloud_edition_billing_resource_check("vector_space", "dataset")
@cloud_edition_billing_resource_check("documents", "dataset")
@cloud_edition_billing_rate_limit_check("knowledge", "dataset")
def post(self, tenant_id, dataset_id):
"""Create document by upload file."""
args = {}
@ -232,6 +239,7 @@ class DocumentUpdateByFileApi(DatasetApiResource):
"""Resource for update documents."""
@cloud_edition_billing_resource_check("vector_space", "dataset")
@cloud_edition_billing_rate_limit_check("knowledge", "dataset")
def post(self, tenant_id, dataset_id, document_id):
"""Update document by upload file."""
args = {}
@ -302,6 +310,7 @@ class DocumentUpdateByFileApi(DatasetApiResource):
class DocumentDeleteApi(DatasetApiResource):
@cloud_edition_billing_rate_limit_check("knowledge", "dataset")
def delete(self, tenant_id, dataset_id, document_id):
"""Delete document."""
document_id = str(document_id)

@ -1,9 +1,10 @@
from controllers.console.datasets.hit_testing_base import DatasetsHitTestingBase
from controllers.service_api import api
from controllers.service_api.wraps import DatasetApiResource
from controllers.service_api.wraps import DatasetApiResource, cloud_edition_billing_rate_limit_check
class HitTestingApi(DatasetApiResource, DatasetsHitTestingBase):
@cloud_edition_billing_rate_limit_check("knowledge", "dataset")
def post(self, tenant_id, dataset_id):
dataset_id_str = str(dataset_id)

@ -3,7 +3,7 @@ from flask_restful import marshal, reqparse
from werkzeug.exceptions import NotFound
from controllers.service_api import api
from controllers.service_api.wraps import DatasetApiResource
from controllers.service_api.wraps import DatasetApiResource, cloud_edition_billing_rate_limit_check
from fields.dataset_fields import dataset_metadata_fields
from services.dataset_service import DatasetService
from services.entities.knowledge_entities.knowledge_entities import (
@ -14,6 +14,7 @@ from services.metadata_service import MetadataService
class DatasetMetadataCreateServiceApi(DatasetApiResource):
@cloud_edition_billing_rate_limit_check("knowledge", "dataset")
def post(self, tenant_id, dataset_id):
parser = reqparse.RequestParser()
parser.add_argument("type", type=str, required=True, nullable=True, location="json")
@ -39,6 +40,7 @@ class DatasetMetadataCreateServiceApi(DatasetApiResource):
class DatasetMetadataServiceApi(DatasetApiResource):
@cloud_edition_billing_rate_limit_check("knowledge", "dataset")
def patch(self, tenant_id, dataset_id, metadata_id):
parser = reqparse.RequestParser()
parser.add_argument("name", type=str, required=True, nullable=True, location="json")
@ -54,6 +56,7 @@ class DatasetMetadataServiceApi(DatasetApiResource):
metadata = MetadataService.update_metadata_name(dataset_id_str, metadata_id_str, args.get("name"))
return marshal(metadata, dataset_metadata_fields), 200
@cloud_edition_billing_rate_limit_check("knowledge", "dataset")
def delete(self, tenant_id, dataset_id, metadata_id):
dataset_id_str = str(dataset_id)
metadata_id_str = str(metadata_id)
@ -73,6 +76,7 @@ class DatasetMetadataBuiltInFieldServiceApi(DatasetApiResource):
class DatasetMetadataBuiltInFieldActionServiceApi(DatasetApiResource):
@cloud_edition_billing_rate_limit_check("knowledge", "dataset")
def post(self, tenant_id, dataset_id, action):
dataset_id_str = str(dataset_id)
dataset = DatasetService.get_dataset(dataset_id_str)
@ -88,6 +92,7 @@ class DatasetMetadataBuiltInFieldActionServiceApi(DatasetApiResource):
class DocumentMetadataEditServiceApi(DatasetApiResource):
@cloud_edition_billing_rate_limit_check("knowledge", "dataset")
def post(self, tenant_id, dataset_id):
dataset_id_str = str(dataset_id)
dataset = DatasetService.get_dataset(dataset_id_str)

@ -8,6 +8,7 @@ from controllers.service_api.app.error import ProviderNotInitializeError
from controllers.service_api.wraps import (
DatasetApiResource,
cloud_edition_billing_knowledge_limit_check,
cloud_edition_billing_rate_limit_check,
cloud_edition_billing_resource_check,
)
from core.errors.error import LLMBadRequestError, ProviderTokenNotInitError
@ -35,6 +36,7 @@ class SegmentApi(DatasetApiResource):
@cloud_edition_billing_resource_check("vector_space", "dataset")
@cloud_edition_billing_knowledge_limit_check("add_segment", "dataset")
@cloud_edition_billing_rate_limit_check("knowledge", "dataset")
def post(self, tenant_id, dataset_id, document_id):
"""Create single segment."""
# check dataset
@ -139,6 +141,7 @@ class SegmentApi(DatasetApiResource):
class DatasetSegmentApi(DatasetApiResource):
@cloud_edition_billing_rate_limit_check("knowledge", "dataset")
def delete(self, tenant_id, dataset_id, document_id, segment_id):
# check dataset
dataset_id = str(dataset_id)
@ -162,6 +165,7 @@ class DatasetSegmentApi(DatasetApiResource):
return 204
@cloud_edition_billing_resource_check("vector_space", "dataset")
@cloud_edition_billing_rate_limit_check("knowledge", "dataset")
def post(self, tenant_id, dataset_id, document_id, segment_id):
# check dataset
dataset_id = str(dataset_id)
@ -236,6 +240,7 @@ class ChildChunkApi(DatasetApiResource):
@cloud_edition_billing_resource_check("vector_space", "dataset")
@cloud_edition_billing_knowledge_limit_check("add_segment", "dataset")
@cloud_edition_billing_rate_limit_check("knowledge", "dataset")
def post(self, tenant_id, dataset_id, document_id, segment_id):
"""Create child chunk."""
# check dataset
@ -332,6 +337,7 @@ class DatasetChildChunkApi(DatasetApiResource):
"""Resource for updating child chunks."""
@cloud_edition_billing_knowledge_limit_check("add_segment", "dataset")
@cloud_edition_billing_rate_limit_check("knowledge", "dataset")
def delete(self, tenant_id, dataset_id, document_id, segment_id, child_chunk_id):
"""Delete child chunk."""
# check dataset
@ -370,6 +376,7 @@ class DatasetChildChunkApi(DatasetApiResource):
@cloud_edition_billing_resource_check("vector_space", "dataset")
@cloud_edition_billing_knowledge_limit_check("add_segment", "dataset")
@cloud_edition_billing_rate_limit_check("knowledge", "dataset")
def patch(self, tenant_id, dataset_id, document_id, segment_id, child_chunk_id):
"""Update child chunk."""
# check dataset

@ -163,7 +163,7 @@ def exchange_token_for_existing_web_user(app_code: str, enterprise_user_decoded:
)
db.session.add(end_user)
db.session.commit()
exp_dt = datetime.now(UTC) + timedelta(hours=dify_config.ACCESS_TOKEN_EXPIRE_MINUTES * 24)
exp_dt = datetime.now(UTC) + timedelta(minutes=dify_config.ACCESS_TOKEN_EXPIRE_MINUTES)
exp = int(exp_dt.timestamp())
payload = {
"iss": site.id,

@ -138,14 +138,11 @@ class DatasetConfigManager:
if not config.get("dataset_configs"):
config["dataset_configs"] = {"retrieval_model": "single"}
if not config["dataset_configs"].get("datasets"):
config["dataset_configs"]["datasets"] = {"strategy": "router", "datasets": []}
if not isinstance(config["dataset_configs"], dict):
raise ValueError("dataset_configs must be of object type")
if not isinstance(config["dataset_configs"], dict):
raise ValueError("dataset_configs must be of object type")
if not config["dataset_configs"].get("datasets"):
config["dataset_configs"]["datasets"] = {"strategy": "router", "datasets": []}
need_manual_query_datasets = config.get("dataset_configs") and config["dataset_configs"].get(
"datasets", {}

@ -5,7 +5,7 @@ import uuid
from collections.abc import Generator, Mapping
from typing import Any, Literal, Optional, Union, overload
from flask import Flask, copy_current_request_context, current_app, has_request_context
from flask import Flask, current_app
from pydantic import ValidationError
from sqlalchemy.orm import sessionmaker
@ -31,6 +31,7 @@ from core.workflow.repositories.workflow_execution_repository import WorkflowExe
from core.workflow.repositories.workflow_node_execution_repository import WorkflowNodeExecutionRepository
from extensions.ext_database import db
from factories import file_factory
from libs.flask_utils import preserve_flask_contexts
from models import Account, App, Conversation, EndUser, Message, Workflow, WorkflowNodeExecutionTriggeredFrom
from models.enums import WorkflowRunTriggeredFrom
from services.conversation_service import ConversationService
@ -366,6 +367,7 @@ class AdvancedChatAppGenerator(MessageBasedAppGenerator):
:param user: account or end user
:param invoke_from: invoke from source
:param application_generate_entity: application generate entity
:param workflow_execution_repository: repository for workflow execution
:param workflow_node_execution_repository: repository for workflow node execution
:param conversation: conversation
:param stream: is stream
@ -399,21 +401,18 @@ class AdvancedChatAppGenerator(MessageBasedAppGenerator):
# new thread with request context and contextvars
context = contextvars.copy_context()
@copy_current_request_context
def worker_with_context():
# Run the worker within the copied context
return context.run(
self._generate_worker,
flask_app=current_app._get_current_object(), # type: ignore
application_generate_entity=application_generate_entity,
queue_manager=queue_manager,
conversation_id=conversation.id,
message_id=message.id,
context=context,
worker_thread = threading.Thread(
target=self._generate_worker,
kwargs={
"flask_app": current_app._get_current_object(), # type: ignore
"application_generate_entity": application_generate_entity,
"queue_manager": queue_manager,
"conversation_id": conversation.id,
"message_id": message.id,
"context": context,
},
)
worker_thread = threading.Thread(target=worker_with_context)
worker_thread.start()
# return response or stream generator
@ -449,24 +448,9 @@ class AdvancedChatAppGenerator(MessageBasedAppGenerator):
:param message_id: message ID
:return:
"""
for var, val in context.items():
var.set(val)
# FIXME(-LAN-): Save current user before entering new app context
from flask import g
saved_user = None
if has_request_context() and hasattr(g, "_login_user"):
saved_user = g._login_user
with flask_app.app_context():
with preserve_flask_contexts(flask_app, context_vars=context):
try:
# Restore user in new app context
if saved_user is not None:
from flask import g
g._login_user = saved_user
# get conversation and message
conversation = self._get_conversation(conversation_id)
message = self._get_message(message_id)

@ -5,7 +5,7 @@ import uuid
from collections.abc import Generator, Mapping
from typing import Any, Literal, Union, overload
from flask import Flask, copy_current_request_context, current_app, has_request_context
from flask import Flask, current_app
from pydantic import ValidationError
from configs import dify_config
@ -23,6 +23,7 @@ from core.model_runtime.errors.invoke import InvokeAuthorizationError
from core.ops.ops_trace_manager import TraceQueueManager
from extensions.ext_database import db
from factories import file_factory
from libs.flask_utils import preserve_flask_contexts
from models import Account, App, EndUser
from services.conversation_service import ConversationService
from services.errors.message import MessageNotExistsError
@ -182,21 +183,18 @@ class AgentChatAppGenerator(MessageBasedAppGenerator):
# new thread with request context and contextvars
context = contextvars.copy_context()
@copy_current_request_context
def worker_with_context():
# Run the worker within the copied context
return context.run(
self._generate_worker,
flask_app=current_app._get_current_object(), # type: ignore
context=context,
application_generate_entity=application_generate_entity,
queue_manager=queue_manager,
conversation_id=conversation.id,
message_id=message.id,
worker_thread = threading.Thread(
target=self._generate_worker,
kwargs={
"flask_app": current_app._get_current_object(), # type: ignore
"context": context,
"application_generate_entity": application_generate_entity,
"queue_manager": queue_manager,
"conversation_id": conversation.id,
"message_id": message.id,
},
)
worker_thread = threading.Thread(target=worker_with_context)
worker_thread.start()
# return response or stream generator
@ -229,24 +227,9 @@ class AgentChatAppGenerator(MessageBasedAppGenerator):
:param message_id: message ID
:return:
"""
for var, val in context.items():
var.set(val)
# FIXME(-LAN-): Save current user before entering new app context
from flask import g
saved_user = None
if has_request_context() and hasattr(g, "_login_user"):
saved_user = g._login_user
with flask_app.app_context():
with preserve_flask_contexts(flask_app, context_vars=context):
try:
# Restore user in new app context
if saved_user is not None:
from flask import g
g._login_user = saved_user
# get conversation and message
conversation = self._get_conversation(conversation_id)
message = self._get_message(message_id)

@ -5,7 +5,7 @@ import uuid
from collections.abc import Generator, Mapping, Sequence
from typing import Any, Literal, Optional, Union, overload
from flask import Flask, copy_current_request_context, current_app, has_request_context
from flask import Flask, current_app
from pydantic import ValidationError
from sqlalchemy.orm import sessionmaker
@ -29,6 +29,7 @@ from core.workflow.repositories.workflow_execution_repository import WorkflowExe
from core.workflow.repositories.workflow_node_execution_repository import WorkflowNodeExecutionRepository
from extensions.ext_database import db
from factories import file_factory
from libs.flask_utils import preserve_flask_contexts
from models import Account, App, EndUser, Workflow, WorkflowNodeExecutionTriggeredFrom
from models.enums import WorkflowRunTriggeredFrom
@ -194,6 +195,7 @@ class WorkflowAppGenerator(BaseAppGenerator):
:param user: account or end user
:param application_generate_entity: application generate entity
:param invoke_from: invoke from source
:param workflow_execution_repository: repository for workflow execution
:param workflow_node_execution_repository: repository for workflow node execution
:param streaming: is stream
:param workflow_thread_pool_id: workflow thread pool id
@ -209,20 +211,17 @@ class WorkflowAppGenerator(BaseAppGenerator):
# new thread with request context and contextvars
context = contextvars.copy_context()
@copy_current_request_context
def worker_with_context():
# Run the worker within the copied context
return context.run(
self._generate_worker,
flask_app=current_app._get_current_object(), # type: ignore
application_generate_entity=application_generate_entity,
queue_manager=queue_manager,
context=context,
workflow_thread_pool_id=workflow_thread_pool_id,
worker_thread = threading.Thread(
target=self._generate_worker,
kwargs={
"flask_app": current_app._get_current_object(), # type: ignore
"application_generate_entity": application_generate_entity,
"queue_manager": queue_manager,
"context": context,
"workflow_thread_pool_id": workflow_thread_pool_id,
},
)
worker_thread = threading.Thread(target=worker_with_context)
worker_thread.start()
# return response or stream generator
@ -408,24 +407,9 @@ class WorkflowAppGenerator(BaseAppGenerator):
:param workflow_thread_pool_id: workflow thread pool id
:return:
"""
for var, val in context.items():
var.set(val)
# FIXME(-LAN-): Save current user before entering new app context
from flask import g
saved_user = None
if has_request_context() and hasattr(g, "_login_user"):
saved_user = g._login_user
with flask_app.app_context():
with preserve_flask_contexts(flask_app, context_vars=context):
try:
# Restore user in new app context
if saved_user is not None:
from flask import g
g._login_user = saved_user
# workflow app
runner = WorkflowAppRunner(
application_generate_entity=application_generate_entity,

@ -542,8 +542,6 @@ class LBModelManager:
return config
return None
def cooldown(self, config: ModelLoadBalancingConfiguration, expire: int = 60) -> None:
"""
Cooldown model load balancing config

@ -251,7 +251,7 @@ class OpsTraceManager:
provider_config_map[tracing_provider]["trace_instance"],
provider_config_map[tracing_provider]["config_class"],
)
decrypt_trace_config_key = str(decrypt_trace_config)
decrypt_trace_config_key = json.dumps(decrypt_trace_config, sort_keys=True)
tracing_instance = cls.ops_trace_instances_cache.get(decrypt_trace_config_key)
if tracing_instance is None:
# create new tracing_instance and update the cache if it absent

@ -156,9 +156,23 @@ class PluginInstallTaskStartResponse(BaseModel):
task_id: str = Field(description="The ID of the install task.")
class PluginUploadResponse(BaseModel):
class PluginVerification(BaseModel):
"""
Verification of the plugin.
"""
class AuthorizedCategory(StrEnum):
Langgenius = "langgenius"
Partner = "partner"
Community = "community"
authorized_category: AuthorizedCategory = Field(description="The authorized category of the plugin.")
class PluginDecodeResponse(BaseModel):
unique_identifier: str = Field(description="The unique identifier of the plugin.")
manifest: PluginDeclaration
verification: Optional[PluginVerification] = Field(default=None, description="Basic verification information")
class PluginOAuthAuthorizationUrlResponse(BaseModel):

@ -10,10 +10,10 @@ from core.plugin.entities.plugin import (
PluginInstallationSource,
)
from core.plugin.entities.plugin_daemon import (
PluginDecodeResponse,
PluginInstallTask,
PluginInstallTaskStartResponse,
PluginListResponse,
PluginUploadResponse,
)
from core.plugin.impl.base import BasePluginClient
@ -53,7 +53,7 @@ class PluginInstaller(BasePluginClient):
tenant_id: str,
pkg: bytes,
verify_signature: bool = False,
) -> PluginUploadResponse:
) -> PluginDecodeResponse:
"""
Upload a plugin package and return the plugin unique identifier.
"""
@ -68,7 +68,7 @@ class PluginInstaller(BasePluginClient):
return self._request_with_plugin_daemon_response(
"POST",
f"plugin/{tenant_id}/management/install/upload/package",
PluginUploadResponse,
PluginDecodeResponse,
files=body,
data=data,
)
@ -176,6 +176,18 @@ class PluginInstaller(BasePluginClient):
params={"plugin_unique_identifier": plugin_unique_identifier},
)
def decode_plugin_from_identifier(self, tenant_id: str, plugin_unique_identifier: str) -> PluginDecodeResponse:
"""
Decode a plugin from an identifier.
"""
return self._request_with_plugin_daemon_response(
"GET",
f"plugin/{tenant_id}/management/decode/from_identifier",
PluginDecodeResponse,
data={"plugin_unique_identifier": plugin_unique_identifier},
headers={"Content-Type": "application/json"},
)
def fetch_plugin_installation_by_ids(
self, tenant_id: str, plugin_ids: Sequence[str]
) -> Sequence[PluginInstallation]:

@ -0,0 +1,233 @@
import json
import logging
import uuid
from functools import wraps
from typing import Any, Optional
from mo_vector.client import MoVectorClient # type: ignore
from pydantic import BaseModel, model_validator
from configs import dify_config
from core.rag.datasource.vdb.vector_base import BaseVector
from core.rag.datasource.vdb.vector_factory import AbstractVectorFactory
from core.rag.datasource.vdb.vector_type import VectorType
from core.rag.embedding.embedding_base import Embeddings
from core.rag.models.document import Document
from extensions.ext_redis import redis_client
from models.dataset import Dataset
logger = logging.getLogger(__name__)
class MatrixoneConfig(BaseModel):
host: str = "localhost"
port: int = 6001
user: str = "dump"
password: str = "111"
database: str = "dify"
metric: str = "l2"
@model_validator(mode="before")
@classmethod
def validate_config(cls, values: dict) -> dict:
if not values["host"]:
raise ValueError("config host is required")
if not values["port"]:
raise ValueError("config port is required")
if not values["user"]:
raise ValueError("config user is required")
if not values["password"]:
raise ValueError("config password is required")
if not values["database"]:
raise ValueError("config database is required")
return values
def ensure_client(func):
@wraps(func)
def wrapper(self, *args, **kwargs):
if self.client is None:
self.client = self._get_client(None, False)
return func(self, *args, **kwargs)
return wrapper
class MatrixoneVector(BaseVector):
"""
Matrixone vector storage implementation.
"""
def __init__(self, collection_name: str, config: MatrixoneConfig):
super().__init__(collection_name)
self.config = config
self.collection_name = collection_name.lower()
self.client = None
@property
def collection_name(self):
return self._collection_name
@collection_name.setter
def collection_name(self, value):
self._collection_name = value
def get_type(self) -> str:
return VectorType.MATRIXONE
def create(self, texts: list[Document], embeddings: list[list[float]], **kwargs):
if self.client is None:
self.client = self._get_client(len(embeddings[0]), True)
return self.add_texts(texts, embeddings)
def _get_client(self, dimension: Optional[int] = None, create_table: bool = False) -> MoVectorClient:
"""
Create a new client for the collection.
The collection will be created if it doesn't exist.
"""
lock_name = f"vector_indexing_lock_{self._collection_name}"
with redis_client.lock(lock_name, timeout=20):
client = MoVectorClient(
connection_string=f"mysql+pymysql://{self.config.user}:{self.config.password}@{self.config.host}:{self.config.port}/{self.config.database}",
table_name=self.collection_name,
vector_dimension=dimension,
create_table=create_table,
)
collection_exist_cache_key = f"vector_indexing_{self._collection_name}"
if redis_client.get(collection_exist_cache_key):
return client
try:
client.create_full_text_index()
except Exception as e:
logger.exception("Failed to create full text index")
redis_client.set(collection_exist_cache_key, 1, ex=3600)
return client
def add_texts(self, documents: list[Document], embeddings: list[list[float]], **kwargs):
if self.client is None:
self.client = self._get_client(len(embeddings[0]), True)
assert self.client is not None
ids = []
for _, doc in enumerate(documents):
if doc.metadata is not None:
doc_id = doc.metadata.get("doc_id", str(uuid.uuid4()))
ids.append(doc_id)
self.client.insert(
texts=[doc.page_content for doc in documents],
embeddings=embeddings,
metadatas=[doc.metadata for doc in documents],
ids=ids,
)
return ids
@ensure_client
def text_exists(self, id: str) -> bool:
assert self.client is not None
result = self.client.get(ids=[id])
return len(result) > 0
@ensure_client
def delete_by_ids(self, ids: list[str]) -> None:
assert self.client is not None
if not ids:
return
self.client.delete(ids=ids)
@ensure_client
def get_ids_by_metadata_field(self, key: str, value: str):
assert self.client is not None
results = self.client.query_by_metadata(filter={key: value})
return [result.id for result in results]
@ensure_client
def delete_by_metadata_field(self, key: str, value: str) -> None:
assert self.client is not None
self.client.delete(filter={key: value})
@ensure_client
def search_by_vector(self, query_vector: list[float], **kwargs: Any) -> list[Document]:
assert self.client is not None
top_k = kwargs.get("top_k", 5)
document_ids_filter = kwargs.get("document_ids_filter")
filter = None
if document_ids_filter:
filter = {"document_id": {"$in": document_ids_filter}}
results = self.client.query(
query_vector=query_vector,
k=top_k,
filter=filter,
)
docs = []
# TODO: add the score threshold to the query
for result in results:
metadata = result.metadata
docs.append(
Document(
page_content=result.document,
metadata=metadata,
)
)
return docs
@ensure_client
def search_by_full_text(self, query: str, **kwargs: Any) -> list[Document]:
assert self.client is not None
top_k = kwargs.get("top_k", 5)
document_ids_filter = kwargs.get("document_ids_filter")
filter = None
if document_ids_filter:
filter = {"document_id": {"$in": document_ids_filter}}
score_threshold = float(kwargs.get("score_threshold", 0.0))
results = self.client.full_text_query(
keywords=[query],
k=top_k,
filter=filter,
)
docs = []
for result in results:
metadata = result.metadata
if isinstance(metadata, str):
import json
metadata = json.loads(metadata)
score = 1 - result.distance
if score >= score_threshold:
metadata["score"] = score
docs.append(
Document(
page_content=result.document,
metadata=metadata,
)
)
return docs
@ensure_client
def delete(self) -> None:
assert self.client is not None
self.client.delete()
class MatrixoneVectorFactory(AbstractVectorFactory):
def init_vector(self, dataset: Dataset, attributes: list, embeddings: Embeddings) -> MatrixoneVector:
if dataset.index_struct_dict:
class_prefix: str = dataset.index_struct_dict["vector_store"]["class_prefix"]
collection_name = class_prefix
else:
dataset_id = dataset.id
collection_name = Dataset.gen_collection_name_by_id(dataset_id)
dataset.index_struct = json.dumps(self.gen_index_struct_dict(VectorType.MATRIXONE, collection_name))
config = MatrixoneConfig(
host=dify_config.MATRIXONE_HOST or "localhost",
port=dify_config.MATRIXONE_PORT or 6001,
user=dify_config.MATRIXONE_USER or "dump",
password=dify_config.MATRIXONE_PASSWORD or "111",
database=dify_config.MATRIXONE_DATABASE or "dify",
metric=dify_config.MATRIXONE_METRIC or "l2",
)
return MatrixoneVector(collection_name=collection_name, config=config)

@ -164,6 +164,10 @@ class Vector:
from core.rag.datasource.vdb.huawei.huawei_cloud_vector import HuaweiCloudVectorFactory
return HuaweiCloudVectorFactory
case VectorType.MATRIXONE:
from core.rag.datasource.vdb.matrixone.matrixone_vector import MatrixoneVectorFactory
return MatrixoneVectorFactory
case _:
raise ValueError(f"Vector store {vector_type} is not supported.")

@ -29,3 +29,4 @@ class VectorType(StrEnum):
OPENGAUSS = "opengauss"
TABLESTORE = "tablestore"
HUAWEI_CLOUD = "huawei_cloud"
MATRIXONE = "matrixone"

@ -41,6 +41,13 @@ class WeaviateVector(BaseVector):
weaviate.connect.connection.has_grpc = False
# Fix to minimize the performance impact of the deprecation check in weaviate-client 3.24.0,
# by changing the connection timeout to pypi.org from 1 second to 0.001 seconds.
# TODO: This can be removed once weaviate-client is updated to 3.26.7 or higher,
# which does not contain the deprecation check.
if hasattr(weaviate.connect.connection, "PYPI_TIMEOUT"):
weaviate.connect.connection.PYPI_TIMEOUT = 0.001
try:
client = weaviate.Client(
url=config.endpoint, auth_client_secret=auth_config, timeout_config=(5, 60), startup_period=None

@ -22,6 +22,7 @@ class FirecrawlApp:
"formats": ["markdown"],
"onlyMainContent": True,
"timeout": 30000,
"integration": "dify",
}
if params:
json_data.update(params)
@ -39,7 +40,7 @@ class FirecrawlApp:
def crawl_url(self, url, params=None) -> str:
# Documentation: https://docs.firecrawl.dev/api-reference/endpoint/crawl-post
headers = self._prepare_headers()
json_data = {"url": url}
json_data = {"url": url, "integration": "dify"}
if params:
json_data.update(params)
response = self._post_request(f"{self.base_url}/v1/crawl", json_data, headers)
@ -49,7 +50,6 @@ class FirecrawlApp:
return cast(str, job_id)
else:
self._handle_error(response, "start crawl job")
# FIXME: unreachable code for mypy
return "" # unreachable
def check_crawl_status(self, job_id) -> dict[str, Any]:
@ -82,7 +82,6 @@ class FirecrawlApp:
)
else:
self._handle_error(response, "check crawl status")
# FIXME: unreachable code for mypy
return {} # unreachable
def _format_crawl_status_response(
@ -126,4 +125,31 @@ class FirecrawlApp:
def _handle_error(self, response, action) -> None:
error_message = response.json().get("error", "Unknown error occurred")
raise Exception(f"Failed to {action}. Status code: {response.status_code}. Error: {error_message}")
raise Exception(f"Failed to {action}. Status code: {response.status_code}. Error: {error_message}") # type: ignore[return]
def search(self, query: str, params: dict[str, Any] | None = None) -> dict[str, Any]:
# Documentation: https://docs.firecrawl.dev/api-reference/endpoint/search
headers = self._prepare_headers()
json_data = {
"query": query,
"limit": 5,
"lang": "en",
"country": "us",
"timeout": 60000,
"ignoreInvalidURLs": False,
"scrapeOptions": {},
"integration": "dify",
}
if params:
json_data.update(params)
response = self._post_request(f"{self.base_url}/v1/search", json_data, headers)
if response.status_code == 200:
response_data = response.json()
if not response_data.get("success"):
raise Exception(f"Search failed. Error: {response_data.get('warning', 'Unknown error')}")
return cast(dict[str, Any], response_data)
elif response.status_code in {402, 409, 500, 429, 408}:
self._handle_error(response, "perform search")
return {} # Avoid additional exception after handling error
else:
raise Exception(f"Failed to perform search. Status code: {response.status_code}")

@ -79,6 +79,16 @@ class NotionExtractor(BaseExtractor):
def _get_notion_database_data(self, database_id: str, query_dict: dict[str, Any] = {}) -> list[Document]:
"""Get all the pages from a Notion database."""
assert self._notion_access_token is not None, "Notion access token is required"
database_content = []
next_cursor = None
has_more = True
while has_more:
current_query = query_dict.copy()
if next_cursor:
current_query["start_cursor"] = next_cursor
res = requests.post(
DATABASE_URL_TMPL.format(database_id=database_id),
headers={
@ -86,15 +96,15 @@ class NotionExtractor(BaseExtractor):
"Content-Type": "application/json",
"Notion-Version": "2022-06-28",
},
json=query_dict,
json=current_query,
)
data = res.json()
response_data = res.json()
database_content = []
if "results" not in data or data["results"] is None:
return []
for result in data["results"]:
if "results" not in response_data or response_data["results"] is None:
break
for result in response_data["results"]:
properties = result["properties"]
data = {}
value: Any
@ -129,6 +139,12 @@ class NotionExtractor(BaseExtractor):
row_content = row_content + f"{key}:{value}\n"
database_content.append(row_content)
has_more = response_data.get("has_more", False)
next_cursor = response_data.get("next_cursor")
if not database_content:
return []
return [Document(page_content="\n".join(database_content))]
def _get_notion_block_data(self, page_id: str) -> list[str]:

@ -104,7 +104,7 @@ class QAIndexProcessor(BaseIndexProcessor):
def format_by_template(self, file: FileStorage, **kwargs) -> list[Document]:
# check file type
if not file.filename or not file.filename.endswith(".csv"):
if not file.filename or not file.filename.lower().endswith(".csv"):
raise ValueError("Invalid file type. Only CSV files are allowed")
try:

@ -496,6 +496,8 @@ class DatasetRetrieval:
all_documents = self.calculate_keyword_score(query, all_documents, top_k)
elif index_type == "high_quality":
all_documents = self.calculate_vector_score(all_documents, top_k, score_threshold)
else:
all_documents = all_documents[:top_k] if top_k else all_documents
self._on_query(query, dataset_ids, app_id, user_from, user_id)

@ -6,7 +6,7 @@ import json
import logging
from typing import Optional, Union
from sqlalchemy import func, select
from sqlalchemy import select
from sqlalchemy.engine import Engine
from sqlalchemy.orm import sessionmaker
@ -146,20 +146,7 @@ class SQLAlchemyWorkflowExecutionRepository(WorkflowExecutionRepository):
db_model.workflow_id = domain_model.workflow_id
db_model.triggered_from = self._triggered_from
# Check if this is a new record
with self._session_factory() as session:
existing = session.scalar(select(WorkflowRun).where(WorkflowRun.id == domain_model.id_))
if not existing:
# For new records, get the next sequence number
stmt = select(func.max(WorkflowRun.sequence_number)).where(
WorkflowRun.app_id == self._app_id,
WorkflowRun.tenant_id == self._tenant_id,
)
max_sequence = session.scalar(stmt)
db_model.sequence_number = (max_sequence or 0) + 1
else:
# For updates, keep the existing sequence number
db_model.sequence_number = existing.sequence_number
# No sequence number generation needed anymore
db_model.type = domain_model.workflow_type
db_model.version = domain_model.workflow_version

@ -9,7 +9,7 @@ from copy import copy, deepcopy
from datetime import UTC, datetime
from typing import Any, Optional, cast
from flask import Flask, current_app, has_request_context
from flask import Flask, current_app
from configs import dify_config
from core.app.apps.base_app_queue_manager import GenerateTaskStoppedError
@ -53,6 +53,7 @@ from core.workflow.nodes.end.end_stream_processor import EndStreamProcessor
from core.workflow.nodes.enums import ErrorStrategy, FailBranchSourceHandle
from core.workflow.nodes.event import RunCompletedEvent, RunRetrieverResourceEvent, RunStreamChunkEvent
from core.workflow.nodes.node_mapping import NODE_TYPE_CLASSES_MAPPING
from libs.flask_utils import preserve_flask_contexts
from models.enums import UserFrom
from models.workflow import WorkflowType
@ -537,24 +538,9 @@ class GraphEngine:
"""
Run parallel nodes
"""
for var, val in context.items():
var.set(val)
# FIXME(-LAN-): Save current user before entering new app context
from flask import g
saved_user = None
if has_request_context() and hasattr(g, "_login_user"):
saved_user = g._login_user
with flask_app.app_context():
with preserve_flask_contexts(flask_app, context_vars=context):
try:
# Restore user in new app context
if saved_user is not None:
from flask import g
g._login_user = saved_user
q.put(
ParallelBranchRunStartedEvent(
parallel_id=parallel_id,
@ -653,26 +639,19 @@ class GraphEngine:
retry_start_at = datetime.now(UTC).replace(tzinfo=None)
# yield control to other threads
time.sleep(0.001)
generator = node_instance.run()
for item in generator:
if isinstance(item, GraphEngineEvent):
if isinstance(item, BaseIterationEvent):
event_stream = node_instance.run()
for event in event_stream:
if isinstance(event, GraphEngineEvent):
# add parallel info to iteration event
item.parallel_id = parallel_id
item.parallel_start_node_id = parallel_start_node_id
item.parent_parallel_id = parent_parallel_id
item.parent_parallel_start_node_id = parent_parallel_start_node_id
elif isinstance(item, BaseLoopEvent):
# add parallel info to loop event
item.parallel_id = parallel_id
item.parallel_start_node_id = parallel_start_node_id
item.parent_parallel_id = parent_parallel_id
item.parent_parallel_start_node_id = parent_parallel_start_node_id
yield item
if isinstance(event, BaseIterationEvent | BaseLoopEvent):
event.parallel_id = parallel_id
event.parallel_start_node_id = parallel_start_node_id
event.parent_parallel_id = parent_parallel_id
event.parent_parallel_start_node_id = parent_parallel_start_node_id
yield event
else:
if isinstance(item, RunCompletedEvent):
run_result = item.run_result
if isinstance(event, RunCompletedEvent):
run_result = event.run_result
if run_result.status == WorkflowNodeExecutionStatus.FAILED:
if (
retries == max_retries
@ -708,7 +687,7 @@ class GraphEngine:
# if run failed, handle error
run_result = self._handle_continue_on_error(
node_instance,
item.run_result,
event.run_result,
self.graph_runtime_state.variable_pool,
handle_exceptions=handle_exceptions,
)
@ -811,28 +790,28 @@ class GraphEngine:
should_continue_retry = False
break
elif isinstance(item, RunStreamChunkEvent):
elif isinstance(event, RunStreamChunkEvent):
yield NodeRunStreamChunkEvent(
id=node_instance.id,
node_id=node_instance.node_id,
node_type=node_instance.node_type,
node_data=node_instance.node_data,
chunk_content=item.chunk_content,
from_variable_selector=item.from_variable_selector,
chunk_content=event.chunk_content,
from_variable_selector=event.from_variable_selector,
route_node_state=route_node_state,
parallel_id=parallel_id,
parallel_start_node_id=parallel_start_node_id,
parent_parallel_id=parent_parallel_id,
parent_parallel_start_node_id=parent_parallel_start_node_id,
)
elif isinstance(item, RunRetrieverResourceEvent):
elif isinstance(event, RunRetrieverResourceEvent):
yield NodeRunRetrieverResourceEvent(
id=node_instance.id,
node_id=node_instance.node_id,
node_type=node_instance.node_type,
node_data=node_instance.node_data,
retriever_resources=item.retriever_resources,
context=item.context,
retriever_resources=event.retriever_resources,
context=event.context,
route_node_state=route_node_state,
parallel_id=parallel_id,
parallel_start_node_id=parallel_start_node_id,

@ -214,7 +214,7 @@ class AgentNode(ToolNode):
)
if tool_runtime.entity.description:
tool_runtime.entity.description.llm = (
extra.get("descrption", "") or tool_runtime.entity.description.llm
extra.get("description", "") or tool_runtime.entity.description.llm
)
for tool_runtime_params in tool_runtime.entity.parameters:
tool_runtime_params.form = (

@ -57,7 +57,6 @@ class StreamProcessor(ABC):
# The branch_identify parameter is added to ensure that
# only nodes in the correct logical branch are included.
reachable_node_ids.append(edge.target_node_id)
ids = self._fetch_node_ids_in_reachable_branch(edge.target_node_id, run_result.edge_source_handle)
reachable_node_ids.extend(ids)
else:
@ -74,6 +73,8 @@ class StreamProcessor(ABC):
self._remove_node_ids_in_unreachable_branch(node_id, reachable_node_ids)
def _fetch_node_ids_in_reachable_branch(self, node_id: str, branch_identify: Optional[str] = None) -> list[str]:
if node_id not in self.rest_node_ids:
self.rest_node_ids.append(node_id)
node_ids = []
for edge in self.graph.edge_mapping.get(node_id, []):
if edge.target_node_id == self.graph.root_node_id:

@ -7,7 +7,7 @@ from datetime import UTC, datetime
from queue import Empty, Queue
from typing import TYPE_CHECKING, Any, Optional, cast
from flask import Flask, current_app, has_request_context
from flask import Flask, current_app
from configs import dify_config
from core.variables import ArrayVariable, IntegerVariable, NoneVariable
@ -37,6 +37,7 @@ from core.workflow.nodes.base import BaseNode
from core.workflow.nodes.enums import NodeType
from core.workflow.nodes.event import NodeEvent, RunCompletedEvent
from core.workflow.nodes.iteration.entities import ErrorHandleMode, IterationNodeData
from libs.flask_utils import preserve_flask_contexts
from .exc import (
InvalidIteratorValueError,
@ -583,23 +584,8 @@ class IterationNode(BaseNode[IterationNodeData]):
"""
run single iteration in parallel mode
"""
for var, val in context.items():
var.set(val)
# FIXME(-LAN-): Save current user before entering new app context
from flask import g
saved_user = None
if has_request_context() and hasattr(g, "_login_user"):
saved_user = g._login_user
with flask_app.app_context():
# Restore user in new app context
if saved_user is not None:
from flask import g
g._login_user = saved_user
with preserve_flask_contexts(flask_app, context_vars=context):
parallel_mode_run_id = uuid.uuid4().hex
graph_engine_copy = graph_engine.create_copy()
variable_pool_copy = graph_engine_copy.graph_runtime_state.variable_pool

@ -54,6 +54,15 @@ class Mail:
use_tls=dify_config.SMTP_USE_TLS,
opportunistic_tls=dify_config.SMTP_OPPORTUNISTIC_TLS,
)
case "sendgrid":
from libs.sendgrid import SendGridClient
if not dify_config.SENDGRID_API_KEY:
raise ValueError("SENDGRID_API_KEY is required for SendGrid mail type")
self._client = SendGridClient(
sendgrid_api_key=dify_config.SENDGRID_API_KEY, _from=dify_config.MAIL_DEFAULT_SEND_FROM or ""
)
case _:
raise ValueError("Unsupported mail type {}".format(mail_type))

@ -19,7 +19,6 @@ workflow_run_for_log_fields = {
workflow_run_for_list_fields = {
"id": fields.String,
"sequence_number": fields.Integer,
"version": fields.String,
"status": fields.String,
"elapsed_time": fields.Float,
@ -36,7 +35,6 @@ advanced_chat_workflow_run_for_list_fields = {
"id": fields.String,
"conversation_id": fields.String,
"message_id": fields.String,
"sequence_number": fields.Integer,
"version": fields.String,
"status": fields.String,
"elapsed_time": fields.Float,
@ -63,7 +61,6 @@ workflow_run_pagination_fields = {
workflow_run_detail_fields = {
"id": fields.String,
"sequence_number": fields.Integer,
"version": fields.String,
"graph": fields.Raw(attribute="graph_dict"),
"inputs": fields.Raw(attribute="inputs_dict"),

@ -0,0 +1,65 @@
import contextvars
from collections.abc import Iterator
from contextlib import contextmanager
from typing import TypeVar
from flask import Flask, g, has_request_context
T = TypeVar("T")
@contextmanager
def preserve_flask_contexts(
flask_app: Flask,
context_vars: contextvars.Context,
) -> Iterator[None]:
"""
A context manager that handles:
1. flask-login's UserProxy copy
2. ContextVars copy
3. flask_app.app_context()
This context manager ensures that the Flask application context is properly set up,
the current user is preserved across context boundaries, and any provided context variables
are set within the new context.
Note:
This manager aims to allow use current_user cross thread and app context,
but it's not the recommend use, it's better to pass user directly in parameters.
Args:
flask_app: The Flask application instance
context_vars: contextvars.Context object containing context variables to be set in the new context
Yields:
None
Example:
```python
with preserve_flask_contexts(flask_app, context_vars=context_vars):
# Code that needs Flask app context and context variables
# Current user will be preserved if available
```
"""
# Set context variables if provided
if context_vars:
for var, val in context_vars.items():
var.set(val)
# Save current user before entering new app context
saved_user = None
if has_request_context() and hasattr(g, "_login_user"):
saved_user = g._login_user
# Enter Flask app context
with flask_app.app_context():
try:
# Restore user in new app context if it was saved
if saved_user is not None:
g._login_user = saved_user
# Yield control back to the caller
yield
finally:
# Any cleanup can be added here if needed
pass

@ -0,0 +1,45 @@
import logging
import sendgrid # type: ignore
from python_http_client.exceptions import ForbiddenError, UnauthorizedError
from sendgrid.helpers.mail import Content, Email, Mail, To # type: ignore
class SendGridClient:
def __init__(self, sendgrid_api_key: str, _from: str):
self.sendgrid_api_key = sendgrid_api_key
self._from = _from
def send(self, mail: dict):
logging.debug("Sending email with SendGrid")
try:
_to = mail["to"]
if not _to:
raise ValueError("SendGridClient: Cannot send email: recipient address is missing.")
sg = sendgrid.SendGridAPIClient(api_key=self.sendgrid_api_key)
from_email = Email(self._from)
to_email = To(_to)
subject = mail["subject"]
content = Content("text/html", mail["html"])
mail = Mail(from_email, to_email, subject, content)
mail_json = mail.get() # type: ignore
response = sg.client.mail.send.post(request_body=mail_json)
logging.debug(response.status_code)
logging.debug(response.body)
logging.debug(response.headers)
except TimeoutError as e:
logging.exception("SendGridClient Timeout occurred while sending email")
raise
except (UnauthorizedError, ForbiddenError) as e:
logging.exception(
"SendGridClient Authentication failed. "
"Verify that your credentials and the 'from' email address are correct"
)
raise
except Exception as e:
logging.exception(f"SendGridClient Unexpected error occurred while sending email to {_to}")
raise

@ -0,0 +1,66 @@
"""remove sequence_number from workflow_runs
Revision ID: 0ab65e1cc7fa
Revises: 4474872b0ee6
Create Date: 2025-06-19 16:33:13.377215
"""
from alembic import op
import models as models
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision = '0ab65e1cc7fa'
down_revision = '4474872b0ee6'
branch_labels = None
depends_on = None
def upgrade():
# ### commands auto generated by Alembic - please adjust! ###
with op.batch_alter_table('workflow_runs', schema=None) as batch_op:
batch_op.drop_index(batch_op.f('workflow_run_tenant_app_sequence_idx'))
batch_op.drop_column('sequence_number')
# ### end Alembic commands ###
def downgrade():
# ### commands auto generated by Alembic - please adjust! ###
# WARNING: This downgrade CANNOT recover the original sequence_number values!
# The original sequence numbers are permanently lost after the upgrade.
# This downgrade will regenerate sequence numbers based on created_at order,
# which may result in different values than the original sequence numbers.
#
# If you need to preserve original sequence numbers, use the alternative
# migration approach that creates a backup table before removal.
# Step 1: Add sequence_number column as nullable first
with op.batch_alter_table('workflow_runs', schema=None) as batch_op:
batch_op.add_column(sa.Column('sequence_number', sa.INTEGER(), autoincrement=False, nullable=True))
# Step 2: Populate sequence_number values based on created_at order within each app
# NOTE: This recreates sequence numbering logic but values will be different
# from the original sequence numbers that were removed in the upgrade
connection = op.get_bind()
connection.execute(sa.text("""
UPDATE workflow_runs
SET sequence_number = subquery.row_num
FROM (
SELECT id, ROW_NUMBER() OVER (
PARTITION BY tenant_id, app_id
ORDER BY created_at, id
) as row_num
FROM workflow_runs
) subquery
WHERE workflow_runs.id = subquery.id
"""))
# Step 3: Make the column NOT NULL and add the index
with op.batch_alter_table('workflow_runs', schema=None) as batch_op:
batch_op.alter_column('sequence_number', nullable=False)
batch_op.create_index(batch_op.f('workflow_run_tenant_app_sequence_idx'), ['tenant_id', 'app_id', 'sequence_number'], unique=False)
# ### end Alembic commands ###

@ -10,7 +10,6 @@ from core.plugin.entities.plugin import GenericProviderID
from core.tools.entities.tool_entities import ToolProviderType
from core.tools.signature import sign_tool_file
from core.workflow.entities.workflow_execution import WorkflowExecutionStatus
from services.plugin.plugin_service import PluginService
if TYPE_CHECKING:
from models.workflow import Workflow
@ -169,6 +168,7 @@ class App(Base):
@property
def deleted_tools(self) -> list:
from core.tools.tool_manager import ToolManager
from services.plugin.plugin_service import PluginService
# get agent mode tools
app_model_config = self.app_model_config

@ -386,7 +386,7 @@ class WorkflowRun(Base):
- id (uuid) Run ID
- tenant_id (uuid) Workspace ID
- app_id (uuid) App ID
- sequence_number (int) Auto-increment sequence number, incremented within the App, starting from 1
- workflow_id (uuid) Workflow ID
- type (string) Workflow type
- triggered_from (string) Trigger source
@ -419,13 +419,12 @@ class WorkflowRun(Base):
__table_args__ = (
db.PrimaryKeyConstraint("id", name="workflow_run_pkey"),
db.Index("workflow_run_triggerd_from_idx", "tenant_id", "app_id", "triggered_from"),
db.Index("workflow_run_tenant_app_sequence_idx", "tenant_id", "app_id", "sequence_number"),
)
id: Mapped[str] = mapped_column(StringUUID, server_default=db.text("uuid_generate_v4()"))
tenant_id: Mapped[str] = mapped_column(StringUUID)
app_id: Mapped[str] = mapped_column(StringUUID)
sequence_number: Mapped[int] = mapped_column()
workflow_id: Mapped[str] = mapped_column(StringUUID)
type: Mapped[str] = mapped_column(db.String(255))
triggered_from: Mapped[str] = mapped_column(db.String(255))
@ -485,7 +484,6 @@ class WorkflowRun(Base):
"id": self.id,
"tenant_id": self.tenant_id,
"app_id": self.app_id,
"sequence_number": self.sequence_number,
"workflow_id": self.workflow_id,
"type": self.type,
"triggered_from": self.triggered_from,
@ -511,7 +509,6 @@ class WorkflowRun(Base):
id=data.get("id"),
tenant_id=data.get("tenant_id"),
app_id=data.get("app_id"),
sequence_number=data.get("sequence_number"),
workflow_id=data.get("workflow_id"),
type=data.get("type"),
triggered_from=data.get("triggered_from"),

@ -18,4 +18,3 @@ ignore_missing_imports=True
[mypy-flask_restful.inputs]
ignore_missing_imports=True

@ -81,6 +81,7 @@ dependencies = [
"weave~=0.51.0",
"yarl~=1.18.3",
"webvtt-py~=0.5.1",
"sendgrid~=6.12.3",
]
# Before adding new dependency, consider place it in
# alphabet order (a-z) and suitable group.
@ -202,4 +203,5 @@ vdb = [
"volcengine-compat~=1.0.0",
"weaviate-client~=3.24.0",
"xinference-client~=1.2.2",
"mo-vector~=0.1.13",
]

@ -421,7 +421,7 @@ class AppDslService:
# Set icon type
icon_type_value = icon_type or app_data.get("icon_type")
if icon_type_value in ["emoji", "link"]:
if icon_type_value in ["emoji", "link", "image"]:
icon_type = icon_type_value
else:
icon_type = "emoji"

@ -59,6 +59,7 @@ from services.external_knowledge_service import ExternalDatasetService
from services.feature_service import FeatureModel, FeatureService
from services.tag_service import TagService
from services.vector_service import VectorService
from tasks.add_document_to_index_task import add_document_to_index_task
from tasks.batch_clean_document_task import batch_clean_document_task
from tasks.clean_notion_document_task import clean_notion_document_task
from tasks.deal_dataset_vector_index_task import deal_dataset_vector_index_task
@ -70,6 +71,7 @@ from tasks.document_indexing_update_task import document_indexing_update_task
from tasks.duplicate_document_indexing_task import duplicate_document_indexing_task
from tasks.enable_segments_to_index_task import enable_segments_to_index_task
from tasks.recover_document_indexing_task import recover_document_indexing_task
from tasks.remove_document_from_index_task import remove_document_from_index_task
from tasks.retry_document_indexing_task import retry_document_indexing_task
from tasks.sync_website_document_indexing_task import sync_website_document_indexing_task
@ -434,7 +436,7 @@ class DatasetService:
raise ValueError(ex.description)
filtered_data["updated_by"] = user.id
filtered_data["updated_at"] = datetime.datetime.now()
filtered_data["updated_at"] = datetime.datetime.now(datetime.UTC).replace(tzinfo=None)
# update Retrieval model
filtered_data["retrieval_model"] = data["retrieval_model"]
@ -976,12 +978,17 @@ class DocumentService:
process_rule = knowledge_config.process_rule
if process_rule:
if process_rule.mode in ("custom", "hierarchical"):
if process_rule.rules:
dataset_process_rule = DatasetProcessRule(
dataset_id=dataset.id,
mode=process_rule.mode,
rules=process_rule.rules.model_dump_json() if process_rule.rules else None,
created_by=account.id,
)
else:
dataset_process_rule = dataset.latest_process_rule
if not dataset_process_rule:
raise ValueError("No process rule found.")
elif process_rule.mode == "automatic":
dataset_process_rule = DatasetProcessRule(
dataset_id=dataset.id,
@ -1603,6 +1610,99 @@ class DocumentService:
if not isinstance(args["process_rule"]["rules"]["segmentation"]["max_tokens"], int):
raise ValueError("Process rule segmentation max_tokens is invalid")
@staticmethod
def batch_update_document_status(dataset: Dataset, document_ids: list[str], action: str, user):
"""
Batch update document status.
Args:
dataset (Dataset): The dataset object
document_ids (list[str]): List of document IDs to update
action (str): Action to perform (enable, disable, archive, un_archive)
user: Current user performing the action
Raises:
DocumentIndexingError: If document is being indexed or not in correct state
"""
if not document_ids:
return
for document_id in document_ids:
document = DocumentService.get_document(dataset.id, document_id)
if not document:
continue
indexing_cache_key = f"document_{document.id}_indexing"
cache_result = redis_client.get(indexing_cache_key)
if cache_result is not None:
raise DocumentIndexingError(f"Document:{document.name} is being indexed, please try again later")
if action == "enable":
if document.enabled:
continue
document.enabled = True
document.disabled_at = None
document.disabled_by = None
document.updated_at = datetime.datetime.now(datetime.UTC).replace(tzinfo=None)
db.session.commit()
# Set cache to prevent indexing the same document multiple times
redis_client.setex(indexing_cache_key, 600, 1)
add_document_to_index_task.delay(document_id)
elif action == "disable":
if not document.completed_at or document.indexing_status != "completed":
raise DocumentIndexingError(f"Document: {document.name} is not completed.")
if not document.enabled:
continue
document.enabled = False
document.disabled_at = datetime.datetime.now(datetime.UTC).replace(tzinfo=None)
document.disabled_by = user.id
document.updated_at = datetime.datetime.now(datetime.UTC).replace(tzinfo=None)
db.session.commit()
# Set cache to prevent indexing the same document multiple times
redis_client.setex(indexing_cache_key, 600, 1)
remove_document_from_index_task.delay(document_id)
elif action == "archive":
if document.archived:
continue
document.archived = True
document.archived_at = datetime.datetime.now(datetime.UTC).replace(tzinfo=None)
document.archived_by = user.id
document.updated_at = datetime.datetime.now(datetime.UTC).replace(tzinfo=None)
db.session.commit()
if document.enabled:
# Set cache to prevent indexing the same document multiple times
redis_client.setex(indexing_cache_key, 600, 1)
remove_document_from_index_task.delay(document_id)
elif action == "un_archive":
if not document.archived:
continue
document.archived = False
document.archived_at = None
document.archived_by = None
document.updated_at = datetime.datetime.now(datetime.UTC).replace(tzinfo=None)
db.session.commit()
# Only re-index if the document is currently enabled
if document.enabled:
# Set cache to prevent indexing the same document multiple times
redis_client.setex(indexing_cache_key, 600, 1)
add_document_to_index_task.delay(document_id)
else:
raise ValueError(f"Invalid action: {action}")
class SegmentService:
@classmethod

@ -101,7 +101,7 @@ class WeightModel(BaseModel):
class RetrievalModel(BaseModel):
search_method: Literal["hybrid_search", "semantic_search", "full_text_search"]
search_method: Literal["hybrid_search", "semantic_search", "full_text_search", "keyword_search"]
reranking_enable: bool
reranking_model: Optional[RerankingModel] = None
reranking_mode: Optional[str] = None

@ -0,0 +1,5 @@
from services.errors.base import BaseServiceError
class PluginInstallationForbiddenError(BaseServiceError):
pass

@ -88,6 +88,26 @@ class WebAppAuthModel(BaseModel):
allow_email_password_login: bool = False
class PluginInstallationScope(StrEnum):
NONE = "none"
OFFICIAL_ONLY = "official_only"
OFFICIAL_AND_SPECIFIC_PARTNERS = "official_and_specific_partners"
ALL = "all"
class PluginInstallationPermissionModel(BaseModel):
# Plugin installation scope possible values:
# none: prohibit all plugin installations
# official_only: allow only Dify official plugins
# official_and_specific_partners: allow official and specific partner plugins
# all: allow installation of all plugins
plugin_installation_scope: PluginInstallationScope = PluginInstallationScope.ALL
# If True, restrict plugin installation to the marketplace only
# Equivalent to ForceEnablePluginVerification
restrict_to_marketplace_only: bool = False
class FeatureModel(BaseModel):
billing: BillingModel = BillingModel()
education: EducationModel = EducationModel()
@ -128,6 +148,7 @@ class SystemFeatureModel(BaseModel):
license: LicenseModel = LicenseModel()
branding: BrandingModel = BrandingModel()
webapp_auth: WebAppAuthModel = WebAppAuthModel()
plugin_installation_permission: PluginInstallationPermissionModel = PluginInstallationPermissionModel()
class FeatureService:
@ -291,3 +312,12 @@ class FeatureService:
features.license.workspaces.enabled = license_info["workspaces"]["enabled"]
features.license.workspaces.limit = license_info["workspaces"]["limit"]
features.license.workspaces.size = license_info["workspaces"]["used"]
if "PluginInstallationPermission" in enterprise_info:
plugin_installation_info = enterprise_info["PluginInstallationPermission"]
features.plugin_installation_permission.plugin_installation_scope = plugin_installation_info[
"pluginInstallationScope"
]
features.plugin_installation_permission.restrict_to_marketplace_only = plugin_installation_info[
"restrictToMarketplaceOnly"
]

@ -3,7 +3,7 @@ import logging
import click
from core.entities import DEFAULT_PLUGIN_ID
from core.plugin.entities.plugin import GenericProviderID, ModelProviderID, ToolProviderID
from models.engine import db
logger = logging.getLogger(__name__)
@ -12,17 +12,17 @@ logger = logging.getLogger(__name__)
class PluginDataMigration:
@classmethod
def migrate(cls) -> None:
cls.migrate_db_records("providers", "provider_name") # large table
cls.migrate_db_records("provider_models", "provider_name")
cls.migrate_db_records("provider_orders", "provider_name")
cls.migrate_db_records("tenant_default_models", "provider_name")
cls.migrate_db_records("tenant_preferred_model_providers", "provider_name")
cls.migrate_db_records("provider_model_settings", "provider_name")
cls.migrate_db_records("load_balancing_model_configs", "provider_name")
cls.migrate_db_records("providers", "provider_name", ModelProviderID) # large table
cls.migrate_db_records("provider_models", "provider_name", ModelProviderID)
cls.migrate_db_records("provider_orders", "provider_name", ModelProviderID)
cls.migrate_db_records("tenant_default_models", "provider_name", ModelProviderID)
cls.migrate_db_records("tenant_preferred_model_providers", "provider_name", ModelProviderID)
cls.migrate_db_records("provider_model_settings", "provider_name", ModelProviderID)
cls.migrate_db_records("load_balancing_model_configs", "provider_name", ModelProviderID)
cls.migrate_datasets()
cls.migrate_db_records("embeddings", "provider_name") # large table
cls.migrate_db_records("dataset_collection_bindings", "provider_name")
cls.migrate_db_records("tool_builtin_providers", "provider")
cls.migrate_db_records("embeddings", "provider_name", ModelProviderID) # large table
cls.migrate_db_records("dataset_collection_bindings", "provider_name", ModelProviderID)
cls.migrate_db_records("tool_builtin_providers", "provider_name", ToolProviderID)
@classmethod
def migrate_datasets(cls) -> None:
@ -66,9 +66,10 @@ limit 1000"""
fg="white",
)
)
retrieval_model["reranking_model"]["reranking_provider_name"] = (
f"{DEFAULT_PLUGIN_ID}/{retrieval_model['reranking_model']['reranking_provider_name']}/{retrieval_model['reranking_model']['reranking_provider_name']}"
)
# update google to langgenius/gemini/google etc.
retrieval_model["reranking_model"]["reranking_provider_name"] = ModelProviderID(
retrieval_model["reranking_model"]["reranking_provider_name"]
).to_string()
retrieval_model_changed = True
click.echo(
@ -86,9 +87,11 @@ limit 1000"""
update_retrieval_model_sql = ", retrieval_model = :retrieval_model"
params["retrieval_model"] = json.dumps(retrieval_model)
params["provider_name"] = ModelProviderID(provider_name).to_string()
sql = f"""update {table_name}
set {provider_column_name} =
concat('{DEFAULT_PLUGIN_ID}/', {provider_column_name}, '/', {provider_column_name})
:provider_name
{update_retrieval_model_sql}
where id = :record_id"""
conn.execute(db.text(sql), params)
@ -122,7 +125,9 @@ limit 1000"""
)
@classmethod
def migrate_db_records(cls, table_name: str, provider_column_name: str) -> None:
def migrate_db_records(
cls, table_name: str, provider_column_name: str, provider_cls: type[GenericProviderID]
) -> None:
click.echo(click.style(f"Migrating [{table_name}] data for plugin", fg="white"))
processed_count = 0
@ -166,7 +171,8 @@ limit 1000"""
)
try:
updated_value = f"{DEFAULT_PLUGIN_ID}/{provider_name}/{provider_name}"
# update jina to langgenius/jina_tool/jina etc.
updated_value = provider_cls(provider_name).to_string()
batch_updates.append((updated_value, record_id))
except Exception as e:
failed_ids.append(record_id)

@ -17,11 +17,18 @@ from core.plugin.entities.plugin import (
PluginInstallation,
PluginInstallationSource,
)
from core.plugin.entities.plugin_daemon import PluginInstallTask, PluginListResponse, PluginUploadResponse
from core.plugin.entities.plugin_daemon import (
PluginDecodeResponse,
PluginInstallTask,
PluginListResponse,
PluginVerification,
)
from core.plugin.impl.asset import PluginAssetManager
from core.plugin.impl.debugging import PluginDebuggingClient
from core.plugin.impl.plugin import PluginInstaller
from extensions.ext_redis import redis_client
from services.errors.plugin import PluginInstallationForbiddenError
from services.feature_service import FeatureService, PluginInstallationScope
logger = logging.getLogger(__name__)
@ -86,6 +93,42 @@ class PluginService:
logger.exception("failed to fetch latest plugin version")
return result
@staticmethod
def _check_marketplace_only_permission():
"""
Check if the marketplace only permission is enabled
"""
features = FeatureService.get_system_features()
if features.plugin_installation_permission.restrict_to_marketplace_only:
raise PluginInstallationForbiddenError("Plugin installation is restricted to marketplace only")
@staticmethod
def _check_plugin_installation_scope(plugin_verification: Optional[PluginVerification]):
"""
Check the plugin installation scope
"""
features = FeatureService.get_system_features()
match features.plugin_installation_permission.plugin_installation_scope:
case PluginInstallationScope.OFFICIAL_ONLY:
if (
plugin_verification is None
or plugin_verification.authorized_category != PluginVerification.AuthorizedCategory.Langgenius
):
raise PluginInstallationForbiddenError("Plugin installation is restricted to official only")
case PluginInstallationScope.OFFICIAL_AND_SPECIFIC_PARTNERS:
if plugin_verification is None or plugin_verification.authorized_category not in [
PluginVerification.AuthorizedCategory.Langgenius,
PluginVerification.AuthorizedCategory.Partner,
]:
raise PluginInstallationForbiddenError(
"Plugin installation is restricted to official and specific partners"
)
case PluginInstallationScope.NONE:
raise PluginInstallationForbiddenError("Installing plugins is not allowed")
case PluginInstallationScope.ALL:
pass
@staticmethod
def get_debugging_key(tenant_id: str) -> str:
"""
@ -208,6 +251,8 @@ class PluginService:
# check if plugin pkg is already downloaded
manager = PluginInstaller()
features = FeatureService.get_system_features()
try:
manager.fetch_plugin_manifest(tenant_id, new_plugin_unique_identifier)
# already downloaded, skip, and record install event
@ -215,7 +260,14 @@ class PluginService:
except Exception:
# plugin not installed, download and upload pkg
pkg = download_plugin_pkg(new_plugin_unique_identifier)
manager.upload_pkg(tenant_id, pkg, verify_signature=False)
response = manager.upload_pkg(
tenant_id,
pkg,
verify_signature=features.plugin_installation_permission.restrict_to_marketplace_only,
)
# check if the plugin is available to install
PluginService._check_plugin_installation_scope(response.verification)
return manager.upgrade_plugin(
tenant_id,
@ -239,6 +291,7 @@ class PluginService:
"""
Upgrade plugin with github
"""
PluginService._check_marketplace_only_permission()
manager = PluginInstaller()
return manager.upgrade_plugin(
tenant_id,
@ -253,33 +306,43 @@ class PluginService:
)
@staticmethod
def upload_pkg(tenant_id: str, pkg: bytes, verify_signature: bool = False) -> PluginUploadResponse:
def upload_pkg(tenant_id: str, pkg: bytes, verify_signature: bool = False) -> PluginDecodeResponse:
"""
Upload plugin package files
returns: plugin_unique_identifier
"""
PluginService._check_marketplace_only_permission()
manager = PluginInstaller()
return manager.upload_pkg(tenant_id, pkg, verify_signature)
features = FeatureService.get_system_features()
response = manager.upload_pkg(
tenant_id,
pkg,
verify_signature=features.plugin_installation_permission.restrict_to_marketplace_only,
)
return response
@staticmethod
def upload_pkg_from_github(
tenant_id: str, repo: str, version: str, package: str, verify_signature: bool = False
) -> PluginUploadResponse:
) -> PluginDecodeResponse:
"""
Install plugin from github release package files,
returns plugin_unique_identifier
"""
PluginService._check_marketplace_only_permission()
pkg = download_with_size_limit(
f"https://github.com/{repo}/releases/download/{version}/{package}", dify_config.PLUGIN_MAX_PACKAGE_SIZE
)
features = FeatureService.get_system_features()
manager = PluginInstaller()
return manager.upload_pkg(
response = manager.upload_pkg(
tenant_id,
pkg,
verify_signature,
verify_signature=features.plugin_installation_permission.restrict_to_marketplace_only,
)
return response
@staticmethod
def upload_bundle(
@ -289,11 +352,15 @@ class PluginService:
Upload a plugin bundle and return the dependencies.
"""
manager = PluginInstaller()
PluginService._check_marketplace_only_permission()
return manager.upload_bundle(tenant_id, bundle, verify_signature)
@staticmethod
def install_from_local_pkg(tenant_id: str, plugin_unique_identifiers: Sequence[str]):
PluginService._check_marketplace_only_permission()
manager = PluginInstaller()
return manager.install_from_identifiers(
tenant_id,
plugin_unique_identifiers,
@ -307,6 +374,8 @@ class PluginService:
Install plugin from github release package files,
returns plugin_unique_identifier
"""
PluginService._check_marketplace_only_permission()
manager = PluginInstaller()
return manager.install_from_identifiers(
tenant_id,
@ -322,28 +391,33 @@ class PluginService:
)
@staticmethod
def fetch_marketplace_pkg(
tenant_id: str, plugin_unique_identifier: str, verify_signature: bool = False
) -> PluginDeclaration:
def fetch_marketplace_pkg(tenant_id: str, plugin_unique_identifier: str) -> PluginDeclaration:
"""
Fetch marketplace package
"""
if not dify_config.MARKETPLACE_ENABLED:
raise ValueError("marketplace is not enabled")
features = FeatureService.get_system_features()
manager = PluginInstaller()
try:
declaration = manager.fetch_plugin_manifest(tenant_id, plugin_unique_identifier)
except Exception:
pkg = download_plugin_pkg(plugin_unique_identifier)
declaration = manager.upload_pkg(tenant_id, pkg, verify_signature).manifest
response = manager.upload_pkg(
tenant_id,
pkg,
verify_signature=features.plugin_installation_permission.restrict_to_marketplace_only,
)
# check if the plugin is available to install
PluginService._check_plugin_installation_scope(response.verification)
declaration = response.manifest
return declaration
@staticmethod
def install_from_marketplace_pkg(
tenant_id: str, plugin_unique_identifiers: Sequence[str], verify_signature: bool = False
):
def install_from_marketplace_pkg(tenant_id: str, plugin_unique_identifiers: Sequence[str]):
"""
Install plugin from marketplace package files,
returns installation task id
@ -353,15 +427,26 @@ class PluginService:
manager = PluginInstaller()
features = FeatureService.get_system_features()
# check if already downloaded
for plugin_unique_identifier in plugin_unique_identifiers:
try:
manager.fetch_plugin_manifest(tenant_id, plugin_unique_identifier)
plugin_decode_response = manager.decode_plugin_from_identifier(tenant_id, plugin_unique_identifier)
# check if the plugin is available to install
PluginService._check_plugin_installation_scope(plugin_decode_response.verification)
# already downloaded, skip
except Exception:
# plugin not installed, download and upload pkg
pkg = download_plugin_pkg(plugin_unique_identifier)
manager.upload_pkg(tenant_id, pkg, verify_signature)
response = manager.upload_pkg(
tenant_id,
pkg,
verify_signature=features.plugin_installation_permission.restrict_to_marketplace_only,
)
# check if the plugin is available to install
PluginService._check_plugin_installation_scope(response.verification)
return manager.install_from_identifiers(
tenant_id,

@ -5,7 +5,7 @@ from sqlalchemy import and_, func, or_, select
from sqlalchemy.orm import Session
from core.workflow.entities.workflow_execution import WorkflowExecutionStatus
from models import App, EndUser, WorkflowAppLog, WorkflowRun
from models import Account, App, EndUser, WorkflowAppLog, WorkflowRun
from models.enums import CreatorUserRole
@ -21,6 +21,8 @@ class WorkflowAppService:
created_at_after: datetime | None = None,
page: int = 1,
limit: int = 20,
created_by_end_user_session_id: str | None = None,
created_by_account: str | None = None,
) -> dict:
"""
Get paginate workflow app logs using SQLAlchemy 2.0 style
@ -32,6 +34,8 @@ class WorkflowAppService:
:param created_at_after: filter logs created after this timestamp
:param page: page number
:param limit: items per page
:param created_by_end_user_session_id: filter by end user session id
:param created_by_account: filter by account email
:return: Pagination object
"""
# Build base statement using SQLAlchemy 2.0 style
@ -71,6 +75,26 @@ class WorkflowAppService:
if created_at_after:
stmt = stmt.where(WorkflowAppLog.created_at >= created_at_after)
# Filter by end user session id or account email
if created_by_end_user_session_id:
stmt = stmt.join(
EndUser,
and_(
WorkflowAppLog.created_by == EndUser.id,
WorkflowAppLog.created_by_role == CreatorUserRole.END_USER,
EndUser.session_id == created_by_end_user_session_id,
),
)
if created_by_account:
stmt = stmt.join(
Account,
and_(
WorkflowAppLog.created_by == Account.id,
WorkflowAppLog.created_by_role == CreatorUserRole.ACCOUNT,
Account.email == created_by_account,
),
)
stmt = stmt.order_by(WorkflowAppLog.created_at.desc())
# Get total count using the same filters

@ -0,0 +1,25 @@
from core.rag.datasource.vdb.matrixone.matrixone_vector import MatrixoneConfig, MatrixoneVector
from tests.integration_tests.vdb.test_vector_store import (
AbstractVectorTest,
get_example_text,
setup_mock_redis,
)
class MatrixoneVectorTest(AbstractVectorTest):
def __init__(self):
super().__init__()
self.vector = MatrixoneVector(
collection_name=self.collection_name,
config=MatrixoneConfig(
host="localhost", port=6001, user="dump", password="111", database="dify", metric="l2"
),
)
def get_ids_by_metadata_field(self):
ids = self.vector.get_ids_by_metadata_field(key="document_id", value=self.example_doc_id)
assert len(ids) == 1
def test_matrixone_vector(setup_mock_redis):
MatrixoneVectorTest().run_all_tests()

@ -1,49 +0,0 @@
import time
import pymysql
def check_oceanbase_ready() -> bool:
try:
connection = pymysql.connect(
host="localhost",
port=2881,
user="root",
password="difyai123456",
)
affected_rows = connection.query("SELECT 1")
return affected_rows == 1
except Exception as e:
print(f"Oceanbase is not ready. Exception: {e}")
return False
finally:
if connection:
connection.close()
def main():
max_attempts = 50
retry_interval_seconds = 2
is_oceanbase_ready = False
for attempt in range(max_attempts):
try:
is_oceanbase_ready = check_oceanbase_ready()
except Exception as e:
print(f"Oceanbase is not ready. Exception: {e}")
is_oceanbase_ready = False
if is_oceanbase_ready:
break
else:
print(f"Attempt {attempt + 1} failed, retry in {retry_interval_seconds} seconds...")
time.sleep(retry_interval_seconds)
if is_oceanbase_ready:
print("Oceanbase is ready.")
else:
print(f"Oceanbase is not ready after {max_attempts} attempting checks.")
exit(1)
if __name__ == "__main__":
main()

@ -163,7 +163,6 @@ def real_workflow_run():
workflow_run.tenant_id = "test-tenant-id"
workflow_run.app_id = "test-app-id"
workflow_run.workflow_id = "test-workflow-id"
workflow_run.sequence_number = 1
workflow_run.type = "chat"
workflow_run.triggered_from = "app-run"
workflow_run.version = "1.0"

@ -0,0 +1,124 @@
import contextvars
import threading
from typing import Optional
import pytest
from flask import Flask
from flask_login import LoginManager, UserMixin, current_user, login_user
from libs.flask_utils import preserve_flask_contexts
class User(UserMixin):
"""Simple User class for testing."""
def __init__(self, id: str):
self.id = id
def get_id(self) -> str:
return self.id
@pytest.fixture
def login_app(app: Flask) -> Flask:
"""Set up a Flask app with flask-login."""
# Set a secret key for the app
app.config["SECRET_KEY"] = "test-secret-key"
login_manager = LoginManager()
login_manager.init_app(app)
@login_manager.user_loader
def load_user(user_id: str) -> Optional[User]:
if user_id == "test_user":
return User("test_user")
return None
return app
@pytest.fixture
def test_user() -> User:
"""Create a test user."""
return User("test_user")
def test_current_user_not_accessible_across_threads(login_app: Flask, test_user: User):
"""
Test that current_user is not accessible in a different thread without preserve_flask_contexts.
This test demonstrates that without the preserve_flask_contexts, we cannot access
current_user in a different thread, even with app_context.
"""
# Log in the user in the main thread
with login_app.test_request_context():
login_user(test_user)
assert current_user.is_authenticated
assert current_user.id == "test_user"
# Store the result of the thread execution
result = {"user_accessible": True, "error": None}
# Define a function to run in a separate thread
def check_user_in_thread():
try:
# Try to access current_user in a different thread with app_context
with login_app.app_context():
# This should fail because current_user is not accessible across threads
# without preserve_flask_contexts
result["user_accessible"] = current_user.is_authenticated
except Exception as e:
result["error"] = str(e) # type: ignore
# Run the function in a separate thread
thread = threading.Thread(target=check_user_in_thread)
thread.start()
thread.join()
# Verify that we got an error or current_user is not authenticated
assert result["error"] is not None or (result["user_accessible"] is not None and not result["user_accessible"])
def test_current_user_accessible_with_preserve_flask_contexts(login_app: Flask, test_user: User):
"""
Test that current_user is accessible in a different thread with preserve_flask_contexts.
This test demonstrates that with the preserve_flask_contexts, we can access
current_user in a different thread.
"""
# Log in the user in the main thread
with login_app.test_request_context():
login_user(test_user)
assert current_user.is_authenticated
assert current_user.id == "test_user"
# Save the context variables
context_vars = contextvars.copy_context()
# Store the result of the thread execution
result = {"user_accessible": False, "user_id": None, "error": None}
# Define a function to run in a separate thread
def check_user_in_thread_with_manager():
try:
# Use preserve_flask_contexts to access current_user in a different thread
with preserve_flask_contexts(login_app, context_vars):
from flask_login import current_user
if current_user:
result["user_accessible"] = True
result["user_id"] = current_user.id
else:
result["user_accessible"] = False
except Exception as e:
result["error"] = str(e) # type: ignore
# Run the function in a separate thread
thread = threading.Thread(target=check_user_in_thread_with_manager)
thread.start()
thread.join()
# Verify that current_user is accessible and has the correct ID
assert result["error"] is None
assert result["user_accessible"] is True
assert result["user_id"] == "test_user"

File diff suppressed because it is too large Load Diff

@ -399,7 +399,7 @@ SUPABASE_URL=your-server-url
# ------------------------------
# The type of vector store to use.
# Supported values are `weaviate`, `qdrant`, `milvus`, `myscale`, `relyt`, `pgvector`, `pgvecto-rs`, `chroma`, `opensearch`, `oracle`, `tencent`, `elasticsearch`, `elasticsearch-ja`, `analyticdb`, `couchbase`, `vikingdb`, `oceanbase`, `opengauss`, `tablestore`,`vastbase`,`tidb`,`tidb_on_qdrant`,`baidu`,`lindorm`,`huawei_cloud`,`upstash`.
# Supported values are `weaviate`, `qdrant`, `milvus`, `myscale`, `relyt`, `pgvector`, `pgvecto-rs`, `chroma`, `opensearch`, `oracle`, `tencent`, `elasticsearch`, `elasticsearch-ja`, `analyticdb`, `couchbase`, `vikingdb`, `oceanbase`, `opengauss`, `tablestore`,`vastbase`,`tidb`,`tidb_on_qdrant`,`baidu`,`lindorm`,`huawei_cloud`,`upstash`, `matrixone`.
VECTOR_STORE=weaviate
# The Weaviate endpoint URL. Only available when VECTOR_STORE is `weaviate`.
@ -490,6 +490,13 @@ TIDB_VECTOR_USER=
TIDB_VECTOR_PASSWORD=
TIDB_VECTOR_DATABASE=dify
# Matrixone vector configurations.
MATRIXONE_HOST=matrixone
MATRIXONE_PORT=6001
MATRIXONE_USER=dump
MATRIXONE_PASSWORD=111
MATRIXONE_DATABASE=dify
# Tidb on qdrant configuration, only available when VECTOR_STORE is `tidb_on_qdrant`
TIDB_ON_QDRANT_URL=http://127.0.0.1
TIDB_ON_QDRANT_API_KEY=dify
@ -719,10 +726,11 @@ NOTION_INTERNAL_SECRET=
# Mail related configuration
# ------------------------------
# Mail type, support: resend, smtp
# Mail type, support: resend, smtp, sendgrid
MAIL_TYPE=resend
# Default send from email address, if not specified
# If using SendGrid, use the 'from' field for authentication if necessary.
MAIL_DEFAULT_SEND_FROM=
# API-Key for the Resend email provider, used when MAIL_TYPE is `resend`.
@ -738,6 +746,9 @@ SMTP_PASSWORD=
SMTP_USE_TLS=true
SMTP_OPPORTUNISTIC_TLS=false
# Sendgid configuration
SENDGRID_API_KEY=
# ------------------------------
# Others Configuration
# ------------------------------
@ -815,7 +826,8 @@ TEXT_GENERATION_TIMEOUT_MS=60000
# Environment Variables for db Service
# ------------------------------
PGUSER=${DB_USERNAME}
# The name of the default postgres user.
POSTGRES_USER=${DB_USERNAME}
# The password for the default postgres user.
POSTGRES_PASSWORD=${DB_PASSWORD}
# The name of the default postgres database.
@ -1067,7 +1079,7 @@ PLUGIN_MEDIA_CACHE_PATH=assets
# Plugin oss bucket
PLUGIN_STORAGE_OSS_BUCKET=
# Plugin oss s3 credentials
PLUGIN_S3_USE_AWS=
PLUGIN_S3_USE_AWS=false
PLUGIN_S3_USE_AWS_MANAGED_IAM=false
PLUGIN_S3_ENDPOINT=
PLUGIN_S3_USE_PATH_STYLE=false

@ -84,7 +84,7 @@ services:
image: postgres:15-alpine
restart: always
environment:
PGUSER: ${PGUSER:-postgres}
POSTGRES_USER: ${POSTGRES_USER:-postgres}
POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-difyai123456}
POSTGRES_DB: ${POSTGRES_DB:-dify}
PGDATA: ${PGDATA:-/var/lib/postgresql/data/pgdata}
@ -451,6 +451,14 @@ services:
OB_CLUSTER_NAME: ${OCEANBASE_CLUSTER_NAME:-difyai}
OB_SERVER_IP: 127.0.0.1
MODE: mini
ports:
- "${OCEANBASE_VECTOR_PORT:-2881}:2881"
healthcheck:
test: [ 'CMD-SHELL', 'obclient -h127.0.0.1 -P2881 -uroot@test -p$${OB_TENANT_PASSWORD} -e "SELECT 1;"' ]
interval: 10s
retries: 30
start_period: 30s
timeout: 10s
# Oracle vector database
oracle:
@ -609,6 +617,18 @@ services:
ports:
- ${MYSCALE_PORT:-8123}:${MYSCALE_PORT:-8123}
# Matrixone vector store.
matrixone:
hostname: matrixone
image: matrixorigin/matrixone:2.1.1
profiles:
- matrixone
restart: always
volumes:
- ./volumes/matrixone/data:/mo-data
ports:
- ${MATRIXONE_PORT:-6001}:${MATRIXONE_PORT:-6001}
# https://www.elastic.co/guide/en/elasticsearch/reference/current/settings.html
# https://www.elastic.co/guide/en/elasticsearch/reference/current/docker.html#docker-prod-prerequisites
elasticsearch:

@ -104,7 +104,7 @@ services:
PLUGIN_PACKAGE_CACHE_PATH: ${PLUGIN_PACKAGE_CACHE_PATH:-plugin_packages}
PLUGIN_MEDIA_CACHE_PATH: ${PLUGIN_MEDIA_CACHE_PATH:-assets}
PLUGIN_STORAGE_OSS_BUCKET: ${PLUGIN_STORAGE_OSS_BUCKET:-}
S3_USE_AWS: ${PLUGIN_S3_USE_AWS:-}
S3_USE_AWS: ${PLUGIN_S3_USE_AWS:-false}
S3_USE_AWS_MANAGED_IAM: ${PLUGIN_S3_USE_AWS_MANAGED_IAM:-false}
S3_ENDPOINT: ${PLUGIN_S3_ENDPOINT:-}
S3_USE_PATH_STYLE: ${PLUGIN_S3_USE_PATH_STYLE:-false}

@ -195,6 +195,11 @@ x-shared-env: &shared-api-worker-env
TIDB_VECTOR_USER: ${TIDB_VECTOR_USER:-}
TIDB_VECTOR_PASSWORD: ${TIDB_VECTOR_PASSWORD:-}
TIDB_VECTOR_DATABASE: ${TIDB_VECTOR_DATABASE:-dify}
MATRIXONE_HOST: ${MATRIXONE_HOST:-matrixone}
MATRIXONE_PORT: ${MATRIXONE_PORT:-6001}
MATRIXONE_USER: ${MATRIXONE_USER:-dump}
MATRIXONE_PASSWORD: ${MATRIXONE_PASSWORD:-111}
MATRIXONE_DATABASE: ${MATRIXONE_DATABASE:-dify}
TIDB_ON_QDRANT_URL: ${TIDB_ON_QDRANT_URL:-http://127.0.0.1}
TIDB_ON_QDRANT_API_KEY: ${TIDB_ON_QDRANT_API_KEY:-dify}
TIDB_ON_QDRANT_CLIENT_TIMEOUT: ${TIDB_ON_QDRANT_CLIENT_TIMEOUT:-20}
@ -322,6 +327,7 @@ x-shared-env: &shared-api-worker-env
SMTP_PASSWORD: ${SMTP_PASSWORD:-}
SMTP_USE_TLS: ${SMTP_USE_TLS:-true}
SMTP_OPPORTUNISTIC_TLS: ${SMTP_OPPORTUNISTIC_TLS:-false}
SENDGRID_API_KEY: ${SENDGRID_API_KEY:-}
INDEXING_MAX_SEGMENTATION_TOKENS_LENGTH: ${INDEXING_MAX_SEGMENTATION_TOKENS_LENGTH:-4000}
INVITE_EXPIRY_HOURS: ${INVITE_EXPIRY_HOURS:-72}
RESET_PASSWORD_TOKEN_EXPIRY_MINUTES: ${RESET_PASSWORD_TOKEN_EXPIRY_MINUTES:-5}
@ -356,7 +362,7 @@ x-shared-env: &shared-api-worker-env
MAX_PARALLEL_LIMIT: ${MAX_PARALLEL_LIMIT:-10}
MAX_ITERATIONS_NUM: ${MAX_ITERATIONS_NUM:-99}
TEXT_GENERATION_TIMEOUT_MS: ${TEXT_GENERATION_TIMEOUT_MS:-60000}
PGUSER: ${PGUSER:-${DB_USERNAME}}
POSTGRES_USER: ${POSTGRES_USER:-${DB_USERNAME}}
POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-${DB_PASSWORD}}
POSTGRES_DB: ${POSTGRES_DB:-${DB_DATABASE}}
PGDATA: ${PGDATA:-/var/lib/postgresql/data/pgdata}
@ -467,7 +473,7 @@ x-shared-env: &shared-api-worker-env
PLUGIN_PACKAGE_CACHE_PATH: ${PLUGIN_PACKAGE_CACHE_PATH:-plugin_packages}
PLUGIN_MEDIA_CACHE_PATH: ${PLUGIN_MEDIA_CACHE_PATH:-assets}
PLUGIN_STORAGE_OSS_BUCKET: ${PLUGIN_STORAGE_OSS_BUCKET:-}
PLUGIN_S3_USE_AWS: ${PLUGIN_S3_USE_AWS:-}
PLUGIN_S3_USE_AWS: ${PLUGIN_S3_USE_AWS:-false}
PLUGIN_S3_USE_AWS_MANAGED_IAM: ${PLUGIN_S3_USE_AWS_MANAGED_IAM:-false}
PLUGIN_S3_ENDPOINT: ${PLUGIN_S3_ENDPOINT:-}
PLUGIN_S3_USE_PATH_STYLE: ${PLUGIN_S3_USE_PATH_STYLE:-false}
@ -591,7 +597,7 @@ services:
image: postgres:15-alpine
restart: always
environment:
PGUSER: ${PGUSER:-postgres}
POSTGRES_USER: ${POSTGRES_USER:-postgres}
POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-difyai123456}
POSTGRES_DB: ${POSTGRES_DB:-dify}
PGDATA: ${PGDATA:-/var/lib/postgresql/data/pgdata}
@ -958,6 +964,14 @@ services:
OB_CLUSTER_NAME: ${OCEANBASE_CLUSTER_NAME:-difyai}
OB_SERVER_IP: 127.0.0.1
MODE: mini
ports:
- "${OCEANBASE_VECTOR_PORT:-2881}:2881"
healthcheck:
test: [ 'CMD-SHELL', 'obclient -h127.0.0.1 -P2881 -uroot@test -p$${OB_TENANT_PASSWORD} -e "SELECT 1;"' ]
interval: 10s
retries: 30
start_period: 30s
timeout: 10s
# Oracle vector database
oracle:
@ -1116,6 +1130,18 @@ services:
ports:
- ${MYSCALE_PORT:-8123}:${MYSCALE_PORT:-8123}
# Matrixone vector store.
matrixone:
hostname: matrixone
image: matrixorigin/matrixone:2.1.1
profiles:
- matrixone
restart: always
volumes:
- ./volumes/matrixone/data:/mo-data
ports:
- ${MATRIXONE_PORT:-6001}:${MATRIXONE_PORT:-6001}
# https://www.elastic.co/guide/en/elasticsearch/reference/current/settings.html
# https://www.elastic.co/guide/en/elasticsearch/reference/current/docker.html#docker-prod-prerequisites
elasticsearch:

@ -1,7 +1,7 @@
# ------------------------------
# Environment Variables for db Service
# ------------------------------
PGUSER=postgres
POSTGRES_USER=postgres
# The password for the default postgres user.
POSTGRES_PASSWORD=difyai123456
# The name of the default postgres database.
@ -133,7 +133,7 @@ PLUGIN_MEDIA_CACHE_PATH=assets
PLUGIN_STORAGE_OSS_BUCKET=
# Plugin oss s3 credentials
PLUGIN_S3_USE_AWS_MANAGED_IAM=false
PLUGIN_S3_USE_AWS=
PLUGIN_S3_USE_AWS=false
PLUGIN_S3_ENDPOINT=
PLUGIN_S3_USE_PATH_STYLE=false
PLUGIN_AWS_ACCESS_KEY=

@ -15,7 +15,7 @@ const Overview = async (props: IDevelopProps) => {
} = params
return (
<div className="h-full overflow-scroll bg-chatbot-bg px-4 py-6 sm:px-12">
<div className="h-full overflow-y-auto bg-chatbot-bg px-4 py-6 sm:px-12">
<ApikeyInfoPanel />
<ChartView
appId={appId}

@ -9,6 +9,7 @@ import { useTranslation } from 'react-i18next'
import { useDebounceFn } from 'ahooks'
import {
RiApps2Line,
RiDragDropLine,
RiExchange2Line,
RiFile4Line,
RiMessage3Line,
@ -16,7 +17,8 @@ import {
} from '@remixicon/react'
import AppCard from './AppCard'
import NewAppCard from './NewAppCard'
import useAppsQueryState from './hooks/useAppsQueryState'
import useAppsQueryState from './hooks/use-apps-query-state'
import { useDSLDragDrop } from './hooks/use-dsl-drag-drop'
import type { AppListResponse } from '@/models/app'
import { fetchAppList } from '@/service/apps'
import { useAppContext } from '@/context/app-context'
@ -29,6 +31,7 @@ import { useStore as useTagStore } from '@/app/components/base/tag-management/st
import TagManagementModal from '@/app/components/base/tag-management'
import TagFilter from '@/app/components/base/tag-management/filter'
import CheckboxWithLabel from '@/app/components/datasets/create/website/base/checkbox-with-label'
import CreateFromDSLModal from '@/app/components/app/create-from-dsl-modal'
const getKey = (
pageIndex: number,
@ -67,6 +70,9 @@ const Apps = () => {
const [tagFilterValue, setTagFilterValue] = useState<string[]>(tagIDs)
const [searchKeywords, setSearchKeywords] = useState(keywords)
const newAppCardRef = useRef<HTMLDivElement>(null)
const containerRef = useRef<HTMLDivElement>(null)
const [showCreateFromDSLModal, setShowCreateFromDSLModal] = useState(false)
const [droppedDSLFile, setDroppedDSLFile] = useState<File | undefined>()
const setKeywords = useCallback((keywords: string) => {
setQuery(prev => ({ ...prev, keywords }))
}, [setQuery])
@ -74,6 +80,17 @@ const Apps = () => {
setQuery(prev => ({ ...prev, tagIDs }))
}, [setQuery])
const handleDSLFileDropped = useCallback((file: File) => {
setDroppedDSLFile(file)
setShowCreateFromDSLModal(true)
}, [])
const { dragging } = useDSLDragDrop({
onDSLFileDropped: handleDSLFileDropped,
containerRef,
enabled: isCurrentWorkspaceEditor,
})
const { data, isLoading, error, setSize, mutate } = useSWRInfinite(
(pageIndex: number, previousPageData: AppListResponse) => getKey(pageIndex, previousPageData, activeTab, isCreatedByMe, tagIDs, searchKeywords),
fetchAppList,
@ -151,6 +168,12 @@ const Apps = () => {
return (
<>
<div ref={containerRef} className='relative flex h-0 shrink-0 grow flex-col overflow-y-auto bg-background-body'>
{dragging && (
<div className="absolute inset-0 z-50 m-0.5 rounded-2xl border-2 border-dashed border-components-dropzone-border-accent bg-[rgba(21,90,239,0.14)] p-2">
</div>
)}
<div className='sticky top-0 z-10 flex flex-wrap items-center justify-between gap-y-2 bg-background-body px-12 pb-2 pt-4 leading-[56px]'>
<TabSliderNew
value={activeTab}
@ -188,11 +211,39 @@ const Apps = () => {
&& <NewAppCard ref={newAppCardRef} className='z-10' onSuccess={mutate} />}
<NoAppsFound />
</div>}
{isCurrentWorkspaceEditor && (
<div
className={`flex items-center justify-center gap-2 py-4 ${dragging ? 'text-text-accent' : 'text-text-quaternary'}`}
role="region"
aria-label={t('app.newApp.dropDSLToCreateApp')}
>
<RiDragDropLine className="h-4 w-4" />
<span className="system-xs-regular">{t('app.newApp.dropDSLToCreateApp')}</span>
</div>
)}
<CheckModal />
<div ref={anchorRef} className='h-0'> </div>
{showTagManagementModal && (
<TagManagementModal type='app' show={showTagManagementModal} />
)}
</div>
{showCreateFromDSLModal && (
<CreateFromDSLModal
show={showCreateFromDSLModal}
onClose={() => {
setShowCreateFromDSLModal(false)
setDroppedDSLFile(undefined)
}}
onSuccess={() => {
setShowCreateFromDSLModal(false)
setDroppedDSLFile(undefined)
mutate()
}}
droppedFile={droppedDSLFile}
/>
)}
</>
)
}

@ -0,0 +1,72 @@
import { useEffect, useState } from 'react'
type DSLDragDropHookProps = {
onDSLFileDropped: (file: File) => void
containerRef: React.RefObject<HTMLDivElement>
enabled?: boolean
}
export const useDSLDragDrop = ({ onDSLFileDropped, containerRef, enabled = true }: DSLDragDropHookProps) => {
const [dragging, setDragging] = useState(false)
const handleDragEnter = (e: DragEvent) => {
e.preventDefault()
e.stopPropagation()
if (e.dataTransfer?.types.includes('Files'))
setDragging(true)
}
const handleDragOver = (e: DragEvent) => {
e.preventDefault()
e.stopPropagation()
}
const handleDragLeave = (e: DragEvent) => {
e.preventDefault()
e.stopPropagation()
if (e.relatedTarget === null || !containerRef.current?.contains(e.relatedTarget as Node))
setDragging(false)
}
const handleDrop = (e: DragEvent) => {
e.preventDefault()
e.stopPropagation()
setDragging(false)
if (!e.dataTransfer)
return
const files = [...e.dataTransfer.files]
if (files.length === 0)
return
const file = files[0]
if (file.name.toLowerCase().endsWith('.yaml') || file.name.toLowerCase().endsWith('.yml'))
onDSLFileDropped(file)
}
useEffect(() => {
if (!enabled)
return
const current = containerRef.current
if (current) {
current.addEventListener('dragenter', handleDragEnter)
current.addEventListener('dragover', handleDragOver)
current.addEventListener('dragleave', handleDragLeave)
current.addEventListener('drop', handleDrop)
}
return () => {
if (current) {
current.removeEventListener('dragenter', handleDragEnter)
current.removeEventListener('dragover', handleDragOver)
current.removeEventListener('dragleave', handleDragLeave)
current.removeEventListener('drop', handleDrop)
}
}
}, [containerRef, enabled])
return {
dragging: enabled ? dragging : false,
}
}

@ -81,7 +81,7 @@ const Datasets = ({
currentContainer?.removeEventListener('scroll', onScroll)
onScroll.cancel()
}
}, [onScroll])
}, [containerRef, onScroll])
return (
<nav className='grid shrink-0 grow grid-cols-1 content-start gap-4 px-12 pt-2 sm:grid-cols-2 md:grid-cols-3 lg:grid-cols-4'>

@ -5,34 +5,34 @@ import {
RiAddLine,
RiArrowRightLine,
} from '@remixicon/react'
import Link from 'next/link'
const CreateAppCard = (
{
ref,
..._
},
) => {
type CreateAppCardProps = {
ref?: React.Ref<HTMLAnchorElement>
}
const CreateAppCard = ({ ref }: CreateAppCardProps) => {
const { t } = useTranslation()
return (
<div className='bg-background-default-dimm flex min-h-[160px] flex-col rounded-xl border-[0.5px]
border-components-panel-border transition-all duration-200 ease-in-out'
>
<a ref={ref} className='group flex grow cursor-pointer items-start p-4' href={`${basePath}/datasets/create`}>
<Link ref={ref} className='group flex grow cursor-pointer items-start p-4' href={`${basePath}/datasets/create`}>
<div className='flex items-center gap-3'>
<div className='flex h-10 w-10 items-center justify-center rounded-lg border border-dashed border-divider-regular bg-background-default-lighter
p-2 group-hover:border-solid group-hover:border-effects-highlight group-hover:bg-background-default-dodge'
>
<RiAddLine className='h-4 w-4 text-text-tertiary group-hover:text-text-accent'/>
<RiAddLine className='h-4 w-4 text-text-tertiary group-hover:text-text-accent' />
</div>
<div className='system-md-semibold text-text-secondary group-hover:text-text-accent'>{t('dataset.createDataset')}</div>
</div>
</a>
</Link>
<div className='system-xs-regular p-4 pt-0 text-text-tertiary'>{t('dataset.createDatasetIntro')}</div>
<a className='group flex cursor-pointer items-center gap-1 rounded-b-xl border-t-[0.5px] border-divider-subtle p-4' href={`${basePath}/datasets/connect`}>
<Link className='group flex cursor-pointer items-center gap-1 rounded-b-xl border-t-[0.5px] border-divider-subtle p-4' href={`${basePath}/datasets/connect`}>
<div className='system-xs-medium text-text-tertiary group-hover:text-text-accent'>{t('dataset.connectDataset')}</div>
<RiArrowRightLine className='h-3.5 w-3.5 text-text-tertiary group-hover:text-text-accent' />
</a>
</Link>
</div>
)
}

@ -8,15 +8,17 @@ import { useRouter } from 'next/navigation'
import { useEffect } from 'react'
export default function DatasetsLayout({ children }: { children: React.ReactNode }) {
const { isCurrentWorkspaceEditor } = useAppContext()
const { isCurrentWorkspaceEditor, isCurrentWorkspaceDatasetOperator, currentWorkspace, isLoadingCurrentWorkspace } = useAppContext()
const router = useRouter()
useEffect(() => {
if (!isCurrentWorkspaceEditor)
if (isLoadingCurrentWorkspace || !currentWorkspace.id)
return
if (!(isCurrentWorkspaceEditor || isCurrentWorkspaceDatasetOperator))
router.replace('/apps')
}, [isCurrentWorkspaceEditor, router])
}, [isCurrentWorkspaceEditor, isCurrentWorkspaceDatasetOperator, isLoadingCurrentWorkspace, currentWorkspace, router])
if (!isCurrentWorkspaceEditor)
if (isLoadingCurrentWorkspace || !(isCurrentWorkspaceEditor || isCurrentWorkspaceDatasetOperator))
return <Loading type='app' />
return (
<ExternalKnowledgeApiProvider>

@ -54,7 +54,7 @@ import { Row, Col, Properties, Property, Heading, SubProperty, PropertyInstructi
</Property>
<Property name='indexing_technique' type='string' key='indexing_technique'>
Index mode
- <code>high_quality</code> High quality: embedding using embedding model, built as vector database index
- <code>high_quality</code> High quality: Embedding using embedding model, built as vector database index
- <code>economy</code> Economy: Build using inverted index of keyword table index
</Property>
<Property name='doc_form' type='string' key='doc_form'>
@ -1124,6 +1124,63 @@ import { Row, Col, Properties, Property, Heading, SubProperty, PropertyInstructi
<hr className='ml-0 mr-0' />
<Heading
url='/datasets/{dataset_id}/documents/status/{action}'
method='PATCH'
title='Update Document Status'
name='#batch_document_status'
/>
<Row>
<Col>
### Path
<Properties>
<Property name='dataset_id' type='string' key='dataset_id'>
Knowledge ID
</Property>
<Property name='action' type='string' key='action'>
- `enable` - Enable document
- `disable` - Disable document
- `archive` - Archive document
- `un_archive` - Unarchive document
</Property>
</Properties>
### Request Body
<Properties>
<Property name='document_ids' type='array[string]' key='document_ids'>
List of document IDs
</Property>
</Properties>
</Col>
<Col sticky>
<CodeGroup
title="Request"
tag="PATCH"
label="/datasets/{dataset_id}/documents/status/{action}"
targetCode={`curl --location --request PATCH '${props.apiBaseUrl}/datasets/{dataset_id}/documents/status/{action}' \\\n--header 'Authorization: Bearer {api_key}' \\\n--header 'Content-Type: application/json' \\\n--data-raw '{\n "document_ids": ["doc-id-1", "doc-id-2"]\n}'`}
>
```bash {{ title: 'cURL' }}
curl --location --request PATCH '${props.apiBaseUrl}/datasets/{dataset_id}/documents/status/{action}' \
--header 'Authorization: Bearer {api_key}' \
--header 'Content-Type: application/json' \
--data-raw '{
"document_ids": ["doc-id-1", "doc-id-2"]
}'
```
</CodeGroup>
<CodeGroup title="Response">
```json {{ title: 'Response' }}
{
"result": "success"
}
```
</CodeGroup>
</Col>
</Row>
<hr className='ml-0 mr-0' />
<Heading
url='/datasets/{dataset_id}/documents/{document_id}/segments'
method='POST'

@ -881,6 +881,63 @@ import { Row, Col, Properties, Property, Heading, SubProperty, PropertyInstructi
<hr className='ml-0 mr-0' />
<Heading
url='/datasets/{dataset_id}/documents/status/{action}'
method='PATCH'
title='ドキュメントステータスの更新'
name='#batch_document_status'
/>
<Row>
<Col>
### パス
<Properties>
<Property name='dataset_id' type='string' key='dataset_id'>
ナレッジ ID
</Property>
<Property name='action' type='string' key='action'>
- `enable` - ドキュメントを有効化
- `disable` - ドキュメントを無効化
- `archive` - ドキュメントをアーカイブ
- `un_archive` - ドキュメントのアーカイブを解除
</Property>
</Properties>
### リクエストボディ
<Properties>
<Property name='document_ids' type='array[string]' key='document_ids'>
ドキュメントIDのリスト
</Property>
</Properties>
</Col>
<Col sticky>
<CodeGroup
title="リクエスト"
tag="PATCH"
label="/datasets/{dataset_id}/documents/status/{action}"
targetCode={`curl --location --request PATCH '${props.apiBaseUrl}/datasets/{dataset_id}/documents/status/{action}' \\\n--header 'Authorization: Bearer {api_key}' \\\n--header 'Content-Type: application/json' \\\n--data-raw '{\n "document_ids": ["doc-id-1", "doc-id-2"]\n}'`}
>
```bash {{ title: 'cURL' }}
curl --location --request PATCH '${props.apiBaseUrl}/datasets/{dataset_id}/documents/status/{action}' \
--header 'Authorization: Bearer {api_key}' \
--header 'Content-Type: application/json' \
--data-raw '{
"document_ids": ["doc-id-1", "doc-id-2"]
}'
```
</CodeGroup>
<CodeGroup title="レスポンス">
```json {{ title: 'Response' }}
{
"result": "success"
}
```
</CodeGroup>
</Col>
</Row>
<hr className='ml-0 mr-0' />
<Heading
url='/datasets/{dataset_id}/documents/{document_id}/segments'
method='POST'
@ -2413,3 +2470,4 @@ import { Row, Col, Properties, Property, Heading, SubProperty, PropertyInstructi
</tbody>
</table>
<div className="pb-4" />

@ -55,7 +55,7 @@ import { Row, Col, Properties, Property, Heading, SubProperty, PropertyInstructi
<Property name='indexing_technique' type='string' key='indexing_technique'>
索引方式
- <code>high_quality</code> 高质量:使用
ding 模型进行嵌入,构建为向量数据库索引
Embedding 模型进行嵌入,构建为向量数据库索引
- <code>economy</code> 经济:使用 keyword table index 的倒排索引进行构建
</Property>
<Property name='doc_form' type='string' key='doc_form'>
@ -1131,6 +1131,63 @@ import { Row, Col, Properties, Property, Heading, SubProperty, PropertyInstructi
<hr className='ml-0 mr-0' />
<Heading
url='/datasets/{dataset_id}/documents/status/{action}'
method='PATCH'
title='更新文档状态'
name='#batch_document_status'
/>
<Row>
<Col>
### Path
<Properties>
<Property name='dataset_id' type='string' key='dataset_id'>
知识库 ID
</Property>
<Property name='action' type='string' key='action'>
- `enable` - 启用文档
- `disable` - 禁用文档
- `archive` - 归档文档
- `un_archive` - 取消归档文档
</Property>
</Properties>
### Request Body
<Properties>
<Property name='document_ids' type='array[string]' key='document_ids'>
文档ID列表
</Property>
</Properties>
</Col>
<Col sticky>
<CodeGroup
title="Request"
tag="PATCH"
label="/datasets/{dataset_id}/documents/status/{action}"
targetCode={`curl --location --request PATCH '${props.apiBaseUrl}/datasets/{dataset_id}/documents/status/{action}' \\\n--header 'Authorization: Bearer {api_key}' \\\n--header 'Content-Type: application/json' \\\n--data-raw '{\n "document_ids": ["doc-id-1", "doc-id-2"]\n}'`}
>
```bash {{ title: 'cURL' }}
curl --location --request PATCH '${props.apiBaseUrl}/datasets/{dataset_id}/documents/status/{action}' \
--header 'Authorization: Bearer {api_key}' \
--header 'Content-Type: application/json' \
--data-raw '{
"document_ids": ["doc-id-1", "doc-id-2"]
}'
```
</CodeGroup>
<CodeGroup title="Response">
```json {{ title: 'Response' }}
{
"result": "success"
}
```
</CodeGroup>
</Col>
</Row>
<hr className='ml-0 mr-0' />
<Heading
url='/datasets/{dataset_id}/documents/{document_id}/segments'
method='POST'

@ -19,7 +19,7 @@ const Layout: FC<{
const [isLoading, setIsLoading] = useState(true)
useEffect(() => {
(async () => {
if (!systemFeatures.webapp_auth.enabled) {
if (!isGlobalPending && !systemFeatures.webapp_auth.enabled) {
setIsLoading(false)
return
}
@ -37,7 +37,7 @@ const Layout: FC<{
setWebAppAccessMode(ret?.accessMode || AccessMode.PUBLIC)
setIsLoading(false)
})()
}, [pathname, redirectUrl, setWebAppAccessMode])
}, [pathname, redirectUrl, setWebAppAccessMode, isGlobalPending, systemFeatures.webapp_auth.enabled])
if (isLoading || isGlobalPending) {
return <div className='flex h-full w-full items-center justify-center'>
<Loading />

@ -314,10 +314,10 @@ const AppPublisher = ({
{!isAppAccessSet && <p className='system-xs-regular mt-1 text-text-warning'>{t('app.publishApp.notSetDesc')}</p>}
</div>}
<div className='flex flex-col gap-y-1 border-t-[0.5px] border-t-divider-regular p-4 pt-3'>
<Tooltip triggerClassName='flex' disabled={!systemFeatures.webapp_auth.enabled || userCanAccessApp?.result} popupContent={t('app.noAccessPermission')} asChild={false}>
<Tooltip triggerClassName='flex' disabled={!systemFeatures.webapp_auth.enabled || appDetail?.access_mode === AccessMode.EXTERNAL_MEMBERS || userCanAccessApp?.result} popupContent={t('app.noAccessPermission')} asChild={false}>
<SuggestedAction
className='flex-1'
disabled={!publishedAt || (systemFeatures.webapp_auth.enabled && !userCanAccessApp?.result)}
disabled={!publishedAt || (systemFeatures.webapp_auth.enabled && appDetail?.access_mode !== AccessMode.EXTERNAL_MEMBERS && !userCanAccessApp?.result)}
link={appURL}
icon={<RiPlayCircleLine className='h-4 w-4' />}
>
@ -326,10 +326,10 @@ const AppPublisher = ({
</Tooltip>
{appDetail?.mode === 'workflow' || appDetail?.mode === 'completion'
? (
<Tooltip triggerClassName='flex' disabled={!systemFeatures.webapp_auth.enabled || userCanAccessApp?.result} popupContent={t('app.noAccessPermission')} asChild={false}>
<Tooltip triggerClassName='flex' disabled={!systemFeatures.webapp_auth.enabled || appDetail.access_mode === AccessMode.EXTERNAL_MEMBERS || userCanAccessApp?.result} popupContent={t('app.noAccessPermission')} asChild={false}>
<SuggestedAction
className='flex-1'
disabled={!publishedAt || (systemFeatures.webapp_auth.enabled && !userCanAccessApp?.result)}
disabled={!publishedAt || (systemFeatures.webapp_auth.enabled && appDetail.access_mode !== AccessMode.EXTERNAL_MEMBERS && !userCanAccessApp?.result)}
link={`${appURL}${appURL.includes('?') ? '&' : '?'}mode=batch`}
icon={<RiPlayList2Line className='h-4 w-4' />}
>

@ -10,7 +10,6 @@ import PromptEditorHeightResizeWrap from './prompt-editor-height-resize-wrap'
import cn from '@/utils/classnames'
import type { PromptVariable } from '@/models/debug'
import Tooltip from '@/app/components/base/tooltip'
import type { CompletionParams } from '@/types/app'
import { AppType } from '@/types/app'
import { getNewVar, getVars } from '@/utils/var'
import AutomaticBtn from '@/app/components/app/configuration/config/automatic/automatic-btn'
@ -63,7 +62,6 @@ const Prompt: FC<ISimplePromptInput> = ({
const { eventEmitter } = useEventEmitterContextContext()
const {
modelConfig,
completionParams,
dataSets,
setModelConfig,
setPrevPromptConfig,
@ -264,14 +262,6 @@ const Prompt: FC<ISimplePromptInput> = ({
{showAutomatic && (
<GetAutomaticResModal
mode={mode as AppType}
model={
{
provider: modelConfig.provider,
name: modelConfig.model_id,
mode: modelConfig.mode,
completion_params: completionParams as CompletionParams,
}
}
isShow={showAutomatic}
onClose={showAutomaticFalse}
onFinished={handleAutomaticRes}

Some files were not shown because too many files have changed in this diff Show More

Loading…
Cancel
Save