fix: remove legacy files in enterprise version

pull/19898/head
GareArc 9 months ago
parent e7003902b7
commit 820476c57e
No known key found for this signature in database

@@ -1,3 +0,0 @@
{
"MD024": false
}

@@ -1,32 +0,0 @@
# Changelog
All notable changes to Dify will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [0.15.7] - 2025-04-27
### Added
- Added support for GPT-4.1 in model providers (#18912)
- Added support for Amazon Bedrock DeepSeek-R1 model (#18908)
- Added support for Amazon Bedrock Claude Sonnet 3.7 model (#18788)
- Refined version compatibility logic in app DSL service
### Fixed
- Fixed issue with creating apps from template categories (#18807, #18868)
- Fixed DSL version check when creating apps from explore templates (#18872, #18878)
## [0.15.6] - 2025-04-22
### Security
- Fixed clickjacking vulnerability (#18552)
- Fixed reset password security issue (#18366)
- Updated reset password token when email code verification succeeds (#18362)
### Fixed
- Fixed Vertex AI Gemini 2.0 Flash 001 schema (#18405)

@@ -1,115 +0,0 @@
model: us.anthropic.claude-3-7-sonnet-20250219-v1:0
label:
en_US: Claude 3.7 Sonnet(US.Cross Region Inference)
icon: icon_s_en.svg
model_type: llm
features:
- agent-thought
- vision
- tool-call
- stream-tool-call
model_properties:
mode: chat
context_size: 200000
# docs: https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-anthropic-claude-messages.html
parameter_rules:
- name: enable_cache
label:
zh_Hans: 启用提示缓存
en_US: Enable Prompt Cache
type: boolean
required: false
default: true
help:
zh_Hans: 启用提示缓存可以提高性能并降低成本。Claude 3.7 Sonnet支持在system、messages和tools字段中使用缓存检查点。
en_US: Enable prompt caching to improve performance and reduce costs. Claude 3.7 Sonnet supports cache checkpoints in system, messages, and tools fields.
- name: reasoning_type
label:
zh_Hans: 推理配置
en_US: Reasoning Type
type: boolean
required: false
default: false
placeholder:
zh_Hans: 设置推理配置
en_US: Set reasoning configuration
help:
zh_Hans: 控制模型的推理能力。启用时temperature将固定为1且top_p将被禁用。
en_US: Controls the model's reasoning capability. When enabled, temperature will be fixed to 1 and top_p will be disabled.
- name: reasoning_budget
show_on:
- variable: reasoning_type
value: true
label:
zh_Hans: 推理预算
en_US: Reasoning Budget
type: int
default: 1024
min: 0
max: 128000
help:
zh_Hans: 推理的预算限制最小1024必须小于max_tokens。仅在推理类型为enabled时可用。
en_US: Budget limit for reasoning (minimum 1024), must be less than max_tokens. Only available when reasoning type is enabled.
- name: max_tokens
use_template: max_tokens
required: true
label:
zh_Hans: 最大token数
en_US: Max Tokens
type: int
default: 8192
min: 1
max: 128000
help:
zh_Hans: 停止前生成的最大令牌数。请注意Anthropic Claude 模型可能会在达到 max_tokens 的值之前停止生成令牌。不同的 Anthropic Claude 模型对此参数具有不同的最大值。
en_US: The maximum number of tokens to generate before stopping. Note that Anthropic Claude models might stop generating tokens before reaching the value of max_tokens. Different Anthropic Claude models have different maximum values for this parameter.
- name: temperature
use_template: temperature
required: false
label:
zh_Hans: 模型温度
en_US: Model Temperature
type: float
default: 1
min: 0.0
max: 1.0
help:
zh_Hans: 生成内容的随机性。当推理功能启用时该值将被固定为1。
en_US: The amount of randomness injected into the response. When reasoning is enabled, this value will be fixed to 1.
- name: top_p
show_on:
- variable: reasoning_type
value: disabled
use_template: top_p
label:
zh_Hans: Top P
en_US: Top P
required: false
type: float
default: 0.999
min: 0.000
max: 1.000
help:
zh_Hans: 在核采样中的概率阈值。当推理功能启用时,该参数将被禁用。
en_US: The probability threshold in nucleus sampling. When reasoning is enabled, this parameter will be disabled.
- name: top_k
label:
zh_Hans: 取样数量
en_US: Top k
required: false
type: int
default: 0
min: 0
    # tip: the AWS docs have an error; the max value is 500
max: 500
help:
zh_Hans: 对于每个后续标记,仅从前 K 个选项中进行采样。使用 top_k 删除长尾低概率响应。
en_US: Only sample from the top K options for each subsequent token. Use top_k to remove long tail low probability responses.
- name: response_format
use_template: response_format
pricing:
input: '0.003'
output: '0.015'
unit: '0.001'
currency: USD
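
The reasoning parameters declared above do not call the model directly; they are translated into Anthropic's extended-thinking fields on the Bedrock Converse API. Below is a minimal, illustrative sketch (not part of the diff) of that mapping, assuming boto3 credentials are configured and using the model ID from this file; the `thinking` field shape follows AWS's documented extended-thinking parameters.

import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")  # region is an assumption
response = client.converse(
    modelId="us.anthropic.claude-3-7-sonnet-20250219-v1:0",
    messages=[{"role": "user", "content": [{"text": "Summarize prompt caching."}]}],
    # temperature is pinned to 1 when reasoning is enabled, matching the YAML help text
    inferenceConfig={"maxTokens": 8192, "temperature": 1},
    additionalModelRequestFields={
        # reasoning_type=true -> extended thinking enabled;
        # reasoning_budget -> budget_tokens (minimum 1024, must stay below maxTokens)
        "thinking": {"type": "enabled", "budget_tokens": 1024}
    },
)
print(response["output"]["message"]["content"][-1]["text"])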

@@ -1,896 +0,0 @@
# standard import
import base64
import json
import logging
from collections.abc import Generator
from typing import Optional, Union, cast
# 3rd import
import boto3 # type: ignore
from botocore.config import Config # type: ignore
from botocore.exceptions import ( # type: ignore
ClientError,
EndpointConnectionError,
NoRegionError,
ServiceNotInRegionError,
UnknownServiceError,
)
# local import
from core.model_runtime.callbacks.base_callback import Callback
from core.model_runtime.entities.llm_entities import LLMResult, LLMResultChunk, LLMResultChunkDelta
from core.model_runtime.entities.message_entities import (
AssistantPromptMessage,
ImagePromptMessageContent,
PromptMessage,
PromptMessageContentType,
PromptMessageTool,
SystemPromptMessage,
TextPromptMessageContent,
ToolPromptMessage,
UserPromptMessage,
)
from core.model_runtime.errors.invoke import (
InvokeAuthorizationError,
InvokeBadRequestError,
InvokeConnectionError,
InvokeError,
InvokeRateLimitError,
InvokeServerUnavailableError,
)
from core.model_runtime.errors.validate import CredentialsValidateFailedError
from core.model_runtime.model_providers.__base.large_language_model import LargeLanguageModel
from core.model_runtime.model_providers.bedrock.get_bedrock_client import get_bedrock_client
logger = logging.getLogger(__name__)
ANTHROPIC_BLOCK_MODE_PROMPT = """You should always follow the instructions and output a valid {{block}} object.
The structure of the {{block}} object can be found in the instructions; use {"answer": "$your_answer"} as the default structure
if you are not sure about the structure.
<instructions>
{{instructions}}
</instructions>
""" # noqa: E501
class BedrockLargeLanguageModel(LargeLanguageModel):
# please refer to the documentation: https://docs.aws.amazon.com/bedrock/latest/userguide/conversation-inference.html
    # TODO: there is an invoke issue (context limit) on the Cohere model; add it after that is fixed.
CONVERSE_API_ENABLED_MODEL_INFO = [
{"prefix": "anthropic.claude-v2", "support_system_prompts": True, "support_tool_use": False},
{"prefix": "us.deepseek", "support_system_prompts": True, "support_tool_use": False},
{"prefix": "anthropic.claude-v1", "support_system_prompts": True, "support_tool_use": False},
{"prefix": "us.anthropic.claude-3", "support_system_prompts": True, "support_tool_use": True},
{"prefix": "eu.anthropic.claude-3", "support_system_prompts": True, "support_tool_use": True},
{"prefix": "anthropic.claude-3", "support_system_prompts": True, "support_tool_use": True},
{"prefix": "us.meta.llama3-2", "support_system_prompts": True, "support_tool_use": True},
{"prefix": "meta.llama", "support_system_prompts": True, "support_tool_use": False},
{"prefix": "mistral.mistral-7b-instruct", "support_system_prompts": False, "support_tool_use": False},
{"prefix": "mistral.mixtral-8x7b-instruct", "support_system_prompts": False, "support_tool_use": False},
{"prefix": "mistral.mistral-large", "support_system_prompts": True, "support_tool_use": True},
{"prefix": "mistral.mistral-small", "support_system_prompts": True, "support_tool_use": True},
{"prefix": "cohere.command-r", "support_system_prompts": True, "support_tool_use": True},
{"prefix": "amazon.titan", "support_system_prompts": False, "support_tool_use": False},
{"prefix": "ai21.jamba-1-5", "support_system_prompts": True, "support_tool_use": False},
{"prefix": "amazon.nova", "support_system_prompts": True, "support_tool_use": False},
{"prefix": "us.amazon.nova", "support_system_prompts": True, "support_tool_use": False},
]
@staticmethod
def _find_model_info(model_id):
for model in BedrockLargeLanguageModel.CONVERSE_API_ENABLED_MODEL_INFO:
if model_id.startswith(model["prefix"]):
return model
logger.info(f"current model id: {model_id} did not support by Converse API")
return None
def _code_block_mode_wrapper(
self,
model: str,
credentials: dict,
prompt_messages: list[PromptMessage],
model_parameters: dict,
tools: Optional[list[PromptMessageTool]] = None,
stop: Optional[list[str]] = None,
stream: bool = True,
user: Optional[str] = None,
callbacks: Optional[list[Callback]] = None,
) -> Union[LLMResult, Generator]:
"""
Code block mode wrapper for invoking large language model
"""
if model_parameters.get("response_format"):
stop = stop or []
if "```\n" not in stop:
stop.append("```\n")
if "\n```" not in stop:
stop.append("\n```")
response_format = model_parameters.pop("response_format")
format_prompt = SystemPromptMessage(
content=ANTHROPIC_BLOCK_MODE_PROMPT.replace("{{instructions}}", prompt_messages[0].content).replace(
"{{block}}", response_format
)
)
if len(prompt_messages) > 0 and isinstance(prompt_messages[0], SystemPromptMessage):
prompt_messages[0] = format_prompt
else:
prompt_messages.insert(0, format_prompt)
prompt_messages.append(AssistantPromptMessage(content=f"\n```{response_format}"))
return self._invoke(model, credentials, prompt_messages, model_parameters, tools, stop, stream, user)
def _invoke(
self,
model: str,
credentials: dict,
prompt_messages: list[PromptMessage],
model_parameters: dict,
tools: Optional[list[PromptMessageTool]] = None,
stop: Optional[list[str]] = None,
stream: bool = True,
user: Optional[str] = None,
) -> Union[LLMResult, Generator]:
"""
Invoke large language model
:param model: model name
:param credentials: model credentials
:param prompt_messages: prompt messages
:param model_parameters: model parameters
:param tools: tools for tool calling
:param stop: stop words
:param stream: is stream response
:param user: unique user id
:return: full response or stream response chunk generator result
"""
model_info = BedrockLargeLanguageModel._find_model_info(model)
if model_info:
model_info["model"] = model
# invoke models via boto3 converse API
return self._generate_with_converse(
model_info, credentials, prompt_messages, model_parameters, stop, stream, user, tools
)
# invoke other models via boto3 client
return self._generate(model, credentials, prompt_messages, model_parameters, stop, stream, user)
def _generate_with_converse(
self,
model_info: dict,
credentials: dict,
prompt_messages: list[PromptMessage],
model_parameters: dict,
stop: Optional[list[str]] = None,
stream: bool = True,
user: Optional[str] = None,
tools: Optional[list[PromptMessageTool]] = None,
) -> Union[LLMResult, Generator]:
"""
Invoke large language model with converse API
:param model_info: model information
:param credentials: model credentials
:param prompt_messages: prompt messages
:param model_parameters: model parameters
:param stop: stop words
:param stream: is stream response
:return: full response or stream response chunk generator result
"""
bedrock_client = get_bedrock_client("bedrock-runtime", credentials)
system, prompt_message_dicts = self._convert_converse_prompt_messages(prompt_messages)
inference_config, additional_model_fields = self._convert_converse_api_model_parameters(model_parameters, stop)
parameters = {
"modelId": model_info["model"],
"messages": prompt_message_dicts,
"inferenceConfig": inference_config,
"additionalModelRequestFields": additional_model_fields,
}
if model_info["support_system_prompts"] and system and len(system) > 0:
parameters["system"] = system
if model_info["support_tool_use"] and tools:
parameters["toolConfig"] = self._convert_converse_tool_config(tools=tools)
try:
# for issue #10976
conversations_list = parameters["messages"]
            # if two consecutive messages share the same role, combine them into one message
for i in range(len(conversations_list) - 2, -1, -1):
if conversations_list[i]["role"] == conversations_list[i + 1]["role"]:
conversations_list[i]["content"].extend(conversations_list.pop(i + 1)["content"])
if stream:
response = bedrock_client.converse_stream(**parameters)
return self._handle_converse_stream_response(
model_info["model"], credentials, response, prompt_messages
)
else:
response = bedrock_client.converse(**parameters)
return self._handle_converse_response(model_info["model"], credentials, response, prompt_messages)
except ClientError as ex:
error_code = ex.response["Error"]["Code"]
full_error_msg = f"{error_code}: {ex.response['Error']['Message']}"
raise self._map_client_to_invoke_error(error_code, full_error_msg)
except (EndpointConnectionError, NoRegionError, ServiceNotInRegionError) as ex:
raise InvokeConnectionError(str(ex))
except UnknownServiceError as ex:
raise InvokeServerUnavailableError(str(ex))
except Exception as ex:
raise InvokeError(str(ex))
def _handle_converse_response(
self, model: str, credentials: dict, response: dict, prompt_messages: list[PromptMessage]
) -> LLMResult:
"""
Handle llm chat response
:param model: model name
:param credentials: credentials
:param response: response
:param prompt_messages: prompt messages
        :return: full response
"""
response_content = response["output"]["message"]["content"]
# transform assistant message to prompt message
if response["stopReason"] == "tool_use":
tool_calls = []
text, tool_use = self._extract_tool_use(response_content)
tool_call = AssistantPromptMessage.ToolCall(
id=tool_use["toolUseId"],
type="function",
function=AssistantPromptMessage.ToolCall.ToolCallFunction(
name=tool_use["name"], arguments=json.dumps(tool_use["input"])
),
)
tool_calls.append(tool_call)
assistant_prompt_message = AssistantPromptMessage(content=text, tool_calls=tool_calls)
else:
assistant_prompt_message = AssistantPromptMessage(content=response_content[0]["text"])
# calculate num tokens
if response["usage"]:
# transform usage
prompt_tokens = response["usage"]["inputTokens"]
completion_tokens = response["usage"]["outputTokens"]
else:
# calculate num tokens
prompt_tokens = self.get_num_tokens(model, credentials, prompt_messages)
completion_tokens = self.get_num_tokens(model, credentials, [assistant_prompt_message])
# transform usage
usage = self._calc_response_usage(model, credentials, prompt_tokens, completion_tokens)
result = LLMResult(
model=model,
prompt_messages=prompt_messages,
message=assistant_prompt_message,
usage=usage,
)
return result
    def _extract_tool_use(self, content: list) -> tuple[str, dict]:
tool_use = {}
text = ""
for item in content:
if "toolUse" in item:
tool_use = item["toolUse"]
elif "text" in item:
text = item["text"]
else:
raise ValueError(f"Got unknown item: {item}")
return text, tool_use
def _handle_converse_stream_response(
self,
model: str,
credentials: dict,
response: dict,
prompt_messages: list[PromptMessage],
) -> Generator:
"""
Handle llm chat stream response
:param model: model name
:param credentials: credentials
:param response: response
:param prompt_messages: prompt messages
        :return: stream response chunk generator result
"""
try:
full_assistant_content = ""
return_model = None
input_tokens = 0
output_tokens = 0
finish_reason = None
index = 0
tool_calls: list[AssistantPromptMessage.ToolCall] = []
tool_use = {}
for chunk in response["stream"]:
if "messageStart" in chunk:
return_model = model
elif "messageStop" in chunk:
finish_reason = chunk["messageStop"]["stopReason"]
elif "contentBlockStart" in chunk:
tool = chunk["contentBlockStart"]["start"]["toolUse"]
tool_use["toolUseId"] = tool["toolUseId"]
tool_use["name"] = tool["name"]
elif "metadata" in chunk:
input_tokens = chunk["metadata"]["usage"]["inputTokens"]
output_tokens = chunk["metadata"]["usage"]["outputTokens"]
usage = self._calc_response_usage(model, credentials, input_tokens, output_tokens)
yield LLMResultChunk(
model=return_model,
prompt_messages=prompt_messages,
delta=LLMResultChunkDelta(
index=index,
message=AssistantPromptMessage(content="", tool_calls=tool_calls),
finish_reason=finish_reason,
usage=usage,
),
)
elif "contentBlockDelta" in chunk:
delta = chunk["contentBlockDelta"]["delta"]
if "text" in delta:
chunk_text = delta["text"] or ""
full_assistant_content += chunk_text
assistant_prompt_message = AssistantPromptMessage(
content=chunk_text or "",
)
index = chunk["contentBlockDelta"]["contentBlockIndex"]
yield LLMResultChunk(
model=model,
prompt_messages=prompt_messages,
delta=LLMResultChunkDelta(
index=index + 1,
message=assistant_prompt_message,
),
)
elif "toolUse" in delta:
if "input" not in tool_use:
tool_use["input"] = ""
tool_use["input"] += delta["toolUse"]["input"]
elif "contentBlockStop" in chunk:
if "input" in tool_use:
tool_call = AssistantPromptMessage.ToolCall(
id=tool_use["toolUseId"],
type="function",
function=AssistantPromptMessage.ToolCall.ToolCallFunction(
name=tool_use["name"], arguments=tool_use["input"]
),
)
tool_calls.append(tool_call)
tool_use = {}
except Exception as ex:
raise InvokeError(str(ex))
def _convert_converse_api_model_parameters(
self, model_parameters: dict, stop: Optional[list[str]] = None
) -> tuple[dict, dict]:
inference_config = {}
additional_model_fields = {}
if "max_tokens" in model_parameters:
inference_config["maxTokens"] = model_parameters["max_tokens"]
if "temperature" in model_parameters:
inference_config["temperature"] = model_parameters["temperature"]
if "top_p" in model_parameters:
inference_config["topP"] = model_parameters["temperature"]
if stop:
inference_config["stopSequences"] = stop
if "top_k" in model_parameters:
additional_model_fields["top_k"] = model_parameters["top_k"]
return inference_config, additional_model_fields
def _convert_converse_prompt_messages(self, prompt_messages: list[PromptMessage]) -> tuple[str, list[dict]]:
"""
Convert prompt messages to dict list and system
"""
system = []
prompt_message_dicts = []
for message in prompt_messages:
if isinstance(message, SystemPromptMessage):
message.content = message.content.strip()
system.append({"text": message.content})
else:
prompt_message_dicts.append(self._convert_prompt_message_to_dict(message))
return system, prompt_message_dicts
def _convert_converse_tool_config(self, tools: Optional[list[PromptMessageTool]] = None) -> dict:
tool_config = {}
configs = []
if tools:
for tool in tools:
configs.append(
{
"toolSpec": {
"name": tool.name,
"description": tool.description,
"inputSchema": {"json": tool.parameters},
}
}
)
tool_config["tools"] = configs
return tool_config
def _convert_prompt_message_to_dict(self, message: PromptMessage) -> dict:
"""
Convert PromptMessage to dict
"""
if isinstance(message, UserPromptMessage):
message = cast(UserPromptMessage, message)
if isinstance(message.content, str):
message_dict = {"role": "user", "content": [{"text": message.content}]}
else:
sub_messages = []
for message_content in message.content:
if message_content.type == PromptMessageContentType.TEXT:
message_content = cast(TextPromptMessageContent, message_content)
sub_message_dict = {"text": message_content.data}
sub_messages.append(sub_message_dict)
elif message_content.type == PromptMessageContentType.IMAGE:
message_content = cast(ImagePromptMessageContent, message_content)
data_split = message_content.data.split(";base64,")
mime_type = data_split[0].replace("data:", "")
base64_data = data_split[1]
image_content = base64.b64decode(base64_data)
if mime_type not in {"image/jpeg", "image/png", "image/gif", "image/webp"}:
raise ValueError(
f"Unsupported image type {mime_type}, "
f"only support image/jpeg, image/png, image/gif, and image/webp"
)
sub_message_dict = {
"image": {"format": mime_type.replace("image/", ""), "source": {"bytes": image_content}}
}
sub_messages.append(sub_message_dict)
message_dict = {"role": "user", "content": sub_messages}
elif isinstance(message, AssistantPromptMessage):
message = cast(AssistantPromptMessage, message)
if message.tool_calls:
message_dict = {
"role": "assistant",
"content": [
{
"toolUse": {
"toolUseId": message.tool_calls[0].id,
"name": message.tool_calls[0].function.name,
"input": json.loads(message.tool_calls[0].function.arguments),
}
}
],
}
else:
message_dict = {"role": "assistant", "content": [{"text": message.content}]}
elif isinstance(message, SystemPromptMessage):
message = cast(SystemPromptMessage, message)
message_dict = [{"text": message.content}]
elif isinstance(message, ToolPromptMessage):
message = cast(ToolPromptMessage, message)
message_dict = {
"role": "user",
"content": [
{
"toolResult": {
"toolUseId": message.tool_call_id,
"content": [{"json": {"text": message.content}}],
}
}
],
}
else:
raise ValueError(f"Got unknown type {message}")
return message_dict
def get_num_tokens(
self,
model: str,
credentials: dict,
prompt_messages: list[PromptMessage] | str,
tools: Optional[list[PromptMessageTool]] = None,
) -> int:
"""
Get number of tokens for given prompt messages
:param model: model name
:param credentials: model credentials
:param prompt_messages: prompt messages or message string
:param tools: tools for tool calling
        :return: number of tokens
"""
prefix = model.split(".")[0]
model_name = model.split(".")[1]
if isinstance(prompt_messages, str):
prompt = prompt_messages
else:
prompt = self._convert_messages_to_prompt(prompt_messages, prefix, model_name)
return self._get_num_tokens_by_gpt2(prompt)
def validate_credentials(self, model: str, credentials: dict) -> None:
"""
Validate model credentials
:param model: model name
:param credentials: model credentials
:return:
"""
required_params = {}
if "anthropic" in model:
required_params = {
"max_tokens": 32,
}
elif "ai21" in model:
# ValidationException: Malformed input request: #/temperature: expected type: Number,
# found: Null#/maxTokens: expected type: Integer, found: Null#/topP: expected type: Number, found: Null,
# please reformat your input and try again.
required_params = {
"temperature": 0.7,
"topP": 0.9,
"maxTokens": 32,
}
try:
ping_message = UserPromptMessage(content="ping")
self._invoke(
model=model,
credentials=credentials,
prompt_messages=[ping_message],
model_parameters=required_params,
stream=False,
)
except ClientError as ex:
error_code = ex.response["Error"]["Code"]
full_error_msg = f"{error_code}: {ex.response['Error']['Message']}"
raise CredentialsValidateFailedError(str(self._map_client_to_invoke_error(error_code, full_error_msg)))
except Exception as ex:
raise CredentialsValidateFailedError(str(ex))
def _convert_one_message_to_text(
self, message: PromptMessage, model_prefix: str, model_name: Optional[str] = None
) -> str:
"""
Convert a single message to a string.
:param message: PromptMessage to convert.
:return: String representation of the message.
"""
human_prompt_prefix = ""
human_prompt_postfix = ""
ai_prompt = ""
content = message.content
if isinstance(message, UserPromptMessage):
body = content
if isinstance(content, list):
body = "".join([c.data for c in content if c.type == PromptMessageContentType.TEXT])
message_text = f"{human_prompt_prefix} {body} {human_prompt_postfix}"
elif isinstance(message, AssistantPromptMessage):
message_text = f"{ai_prompt} {content}"
elif isinstance(message, SystemPromptMessage):
message_text = content
elif isinstance(message, ToolPromptMessage):
message_text = f"{human_prompt_prefix} {message.content}"
else:
raise ValueError(f"Got unknown type {message}")
return message_text
def _convert_messages_to_prompt(
self, messages: list[PromptMessage], model_prefix: str, model_name: Optional[str] = None
) -> str:
"""
Format a list of messages into a full prompt for the Anthropic, Amazon and Llama models
:param messages: List of PromptMessage to combine.
        :param model_name: specific model name. Optional, just to distinguish llama2 and llama3
:return: Combined string with necessary human_prompt and ai_prompt tags.
"""
if not messages:
return ""
messages = messages.copy() # don't mutate the original list
if not isinstance(messages[-1], AssistantPromptMessage):
messages.append(AssistantPromptMessage(content=""))
text = "".join(self._convert_one_message_to_text(message, model_prefix, model_name) for message in messages)
# trim off the trailing ' ' that might come from the "Assistant: "
return text.rstrip()
def _create_payload(
self,
model: str,
prompt_messages: list[PromptMessage],
model_parameters: dict,
stop: Optional[list[str]] = None,
stream: bool = True,
):
"""
Create payload for bedrock api call depending on model provider
"""
payload = {}
model_prefix = model.split(".")[0]
model_name = model.split(".")[1]
if model_prefix == "ai21":
payload["temperature"] = model_parameters.get("temperature")
payload["topP"] = model_parameters.get("topP")
payload["maxTokens"] = model_parameters.get("maxTokens")
payload["prompt"] = self._convert_messages_to_prompt(prompt_messages, model_prefix)
if model_parameters.get("presencePenalty"):
payload["presencePenalty"] = {model_parameters.get("presencePenalty")}
if model_parameters.get("frequencyPenalty"):
payload["frequencyPenalty"] = {model_parameters.get("frequencyPenalty")}
if model_parameters.get("countPenalty"):
payload["countPenalty"] = {model_parameters.get("countPenalty")}
elif model_prefix == "cohere":
payload = {**model_parameters}
payload["prompt"] = prompt_messages[0].content
payload["stream"] = stream
else:
raise ValueError(f"Got unknown model prefix {model_prefix}")
return payload
def _generate(
self,
model: str,
credentials: dict,
prompt_messages: list[PromptMessage],
model_parameters: dict,
stop: Optional[list[str]] = None,
stream: bool = True,
user: Optional[str] = None,
) -> Union[LLMResult, Generator]:
"""
Invoke large language model
:param model: model name
:param credentials: credentials kwargs
:param prompt_messages: prompt messages
:param model_parameters: model parameters
:param stop: stop words
:param stream: is stream response
:param user: unique user id
:return: full response or stream response chunk generator result
"""
client_config = Config(region_name=credentials["aws_region"])
runtime_client = boto3.client(
service_name="bedrock-runtime",
config=client_config,
aws_access_key_id=credentials.get("aws_access_key_id"),
aws_secret_access_key=credentials.get("aws_secret_access_key"),
)
model_prefix = model.split(".")[0]
payload = self._create_payload(model, prompt_messages, model_parameters, stop, stream)
        # workaround needed for ai21 models, which don't support streaming
if stream and model_prefix != "ai21":
invoke = runtime_client.invoke_model_with_response_stream
else:
invoke = runtime_client.invoke_model
try:
body_jsonstr = json.dumps(payload)
response = invoke(modelId=model, contentType="application/json", accept="*/*", body=body_jsonstr)
except ClientError as ex:
error_code = ex.response["Error"]["Code"]
full_error_msg = f"{error_code}: {ex.response['Error']['Message']}"
raise self._map_client_to_invoke_error(error_code, full_error_msg)
except (EndpointConnectionError, NoRegionError, ServiceNotInRegionError) as ex:
raise InvokeConnectionError(str(ex))
except UnknownServiceError as ex:
raise InvokeServerUnavailableError(str(ex))
except Exception as ex:
raise InvokeError(str(ex))
if stream:
return self._handle_generate_stream_response(model, credentials, response, prompt_messages)
return self._handle_generate_response(model, credentials, response, prompt_messages)
def _handle_generate_response(
self, model: str, credentials: dict, response: dict, prompt_messages: list[PromptMessage]
) -> LLMResult:
"""
Handle llm response
:param model: model name
:param credentials: credentials
:param response: response
:param prompt_messages: prompt messages
:return: llm response
"""
response_body = json.loads(response.get("body").read().decode("utf-8"))
finish_reason = response_body.get("error")
if finish_reason is not None:
raise InvokeError(finish_reason)
# get output text and calculate num tokens based on model / provider
model_prefix = model.split(".")[0]
if model_prefix == "ai21":
output = response_body.get("completions")[0].get("data").get("text")
prompt_tokens = len(response_body.get("prompt").get("tokens"))
completion_tokens = len(response_body.get("completions")[0].get("data").get("tokens"))
elif model_prefix == "cohere":
output = response_body.get("generations")[0].get("text")
prompt_tokens = self.get_num_tokens(model, credentials, prompt_messages)
completion_tokens = self.get_num_tokens(model, credentials, output or "")
else:
raise ValueError(f"Got unknown model prefix {model_prefix} when handling block response")
# construct assistant message from output
assistant_prompt_message = AssistantPromptMessage(content=output)
# calculate usage
usage = self._calc_response_usage(model, credentials, prompt_tokens, completion_tokens)
# construct response
result = LLMResult(
model=model,
prompt_messages=prompt_messages,
message=assistant_prompt_message,
usage=usage,
)
return result
def _handle_generate_stream_response(
self, model: str, credentials: dict, response: dict, prompt_messages: list[PromptMessage]
) -> Generator:
"""
Handle llm stream response
:param model: model name
:param credentials: credentials
:param response: response
:param prompt_messages: prompt messages
:return: llm response chunk generator result
"""
model_prefix = model.split(".")[0]
if model_prefix == "ai21":
response_body = json.loads(response.get("body").read().decode("utf-8"))
content = response_body.get("completions")[0].get("data").get("text")
finish_reason = response_body.get("completions")[0].get("finish_reason")
prompt_tokens = len(response_body.get("prompt").get("tokens"))
completion_tokens = len(response_body.get("completions")[0].get("data").get("tokens"))
usage = self._calc_response_usage(model, credentials, prompt_tokens, completion_tokens)
yield LLMResultChunk(
model=model,
prompt_messages=prompt_messages,
delta=LLMResultChunkDelta(
index=0, message=AssistantPromptMessage(content=content), finish_reason=finish_reason, usage=usage
),
)
return
stream = response.get("body")
if not stream:
raise InvokeError("No response body")
index = -1
for event in stream:
chunk = event.get("chunk")
if not chunk:
exception_name = next(iter(event))
full_ex_msg = f"{exception_name}: {event[exception_name]['message']}"
raise self._map_client_to_invoke_error(exception_name, full_ex_msg)
payload = json.loads(chunk.get("bytes").decode())
model_prefix = model.split(".")[0]
if model_prefix == "cohere":
content_delta = payload.get("text")
finish_reason = payload.get("finish_reason")
else:
raise ValueError(f"Got unknown model prefix {model_prefix} when handling stream response")
# transform assistant message to prompt message
assistant_prompt_message = AssistantPromptMessage(
content=content_delta or "",
)
index += 1
if not finish_reason:
yield LLMResultChunk(
model=model,
prompt_messages=prompt_messages,
delta=LLMResultChunkDelta(index=index, message=assistant_prompt_message),
)
else:
# get num tokens from metrics in last chunk
prompt_tokens = payload["amazon-bedrock-invocationMetrics"]["inputTokenCount"]
completion_tokens = payload["amazon-bedrock-invocationMetrics"]["outputTokenCount"]
# transform usage
usage = self._calc_response_usage(model, credentials, prompt_tokens, completion_tokens)
yield LLMResultChunk(
model=model,
prompt_messages=prompt_messages,
delta=LLMResultChunkDelta(
index=index, message=assistant_prompt_message, finish_reason=finish_reason, usage=usage
),
)
@property
def _invoke_error_mapping(self) -> dict[type[InvokeError], list[type[Exception]]]:
"""
Map model invoke error to unified error
        The key is the error type thrown to the caller
        The value is the error type thrown by the model,
        which needs to be converted into a unified error type for the caller.
        :return: Invoke error mapping
"""
return {
InvokeConnectionError: [],
InvokeServerUnavailableError: [],
InvokeRateLimitError: [],
InvokeAuthorizationError: [],
InvokeBadRequestError: [],
}
def _map_client_to_invoke_error(self, error_code: str, error_msg: str) -> type[InvokeError]:
"""
Map client error to invoke error
:param error_code: error code
:param error_msg: error message
:return: invoke error
"""
if error_code == "AccessDeniedException":
return InvokeAuthorizationError(error_msg)
elif error_code in {"ResourceNotFoundException", "ValidationException"}:
return InvokeBadRequestError(error_msg)
elif error_code in {"ThrottlingException", "ServiceQuotaExceededException"}:
return InvokeRateLimitError(error_msg)
elif error_code in {
"ModelTimeoutException",
"ModelErrorException",
"InternalServerException",
"ModelNotReadyException",
}:
return InvokeServerUnavailableError(error_msg)
elif error_code == "ModelStreamErrorException":
return InvokeConnectionError(error_msg)
return InvokeError(error_msg)
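
One subtlety in `_generate_with_converse` above deserves a standalone illustration: the backwards merge loop for issue #10976. This runnable sketch (not part of the diff) shows how consecutive messages with the same role are collapsed so the Converse API never receives two adjacent turns from the same role:

conversations = [
    {"role": "user", "content": [{"text": "a"}]},
    {"role": "user", "content": [{"text": "b"}]},
    {"role": "assistant", "content": [{"text": "c"}]},
]
# iterate backwards so pops don't shift unvisited indices
for i in range(len(conversations) - 2, -1, -1):
    if conversations[i]["role"] == conversations[i + 1]["role"]:
        conversations[i]["content"].extend(conversations.pop(i + 1)["content"])
assert conversations == [
    {"role": "user", "content": [{"text": "a"}, {"text": "b"}]},
    {"role": "assistant", "content": [{"text": "c"}]},
]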

@@ -1,63 +0,0 @@
model: us.deepseek.r1-v1:0
label:
en_US: DeepSeek-R1(US.Cross Region Inference)
icon: icon_s_en.svg
model_type: llm
features:
- agent-thought
- vision
- tool-call
- stream-tool-call
model_properties:
mode: chat
context_size: 32768
parameter_rules:
- name: max_tokens
use_template: max_tokens
required: true
label:
zh_Hans: 最大token数
en_US: Max Tokens
type: int
default: 8192
min: 1
max: 128000
help:
zh_Hans: 停止前生成的最大令牌数。
en_US: The maximum number of tokens to generate before stopping.
- name: temperature
use_template: temperature
required: false
label:
zh_Hans: 模型温度
en_US: Model Temperature
type: float
default: 1
min: 0.0
max: 1.0
help:
zh_Hans: 生成内容的随机性。当推理功能启用时该值将被固定为1。
en_US: The amount of randomness injected into the response. When reasoning is enabled, this value will be fixed to 1.
- name: top_p
show_on:
- variable: reasoning_type
value: disabled
use_template: top_p
label:
zh_Hans: Top P
en_US: Top P
required: false
type: float
default: 0.999
min: 0.000
max: 1.000
help:
zh_Hans: 在核采样中的概率阈值。当推理功能启用时,该参数将被禁用。
en_US: The probability threshold in nucleus sampling. When reasoning is enabled, this parameter will be disabled.
- name: response_format
use_template: response_format
pricing:
input: '0.001'
output: '0.005'
unit: '0.001'
currency: USD

@@ -1,28 +0,0 @@
import logging
from core.model_runtime.entities.model_entities import ModelType
from core.model_runtime.errors.validate import CredentialsValidateFailedError
from core.model_runtime.model_providers.__base.model_provider import ModelProvider
logger = logging.getLogger(__name__)
class GoogleProvider(ModelProvider):
def validate_provider_credentials(self, credentials: dict) -> None:
"""
Validate provider credentials
if validate failed, raise exception
:param credentials: provider credentials, credentials form defined in `provider_credential_schema`.
"""
try:
model_instance = self.get_model_instance(ModelType.LLM)
            # Use `gemini-2.0-flash` model for validation
model_instance.validate_credentials(model="gemini-2.0-flash", credentials=credentials)
except CredentialsValidateFailedError as ex:
raise ex
except Exception as ex:
logger.exception(f"{self.get_provider_schema().provider} credentials validate failed")
raise ex
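
For reference, the credential check performed here through Dify's model runtime boils down to a "ping" generation against the Gemini API. This illustrative sketch (not part of the diff) shows the equivalent direct check, assuming the google-generativeai package; it is not how Dify itself wires credentials.

import google.generativeai as genai

def validate(api_key: str) -> None:
    # raises if the key is invalid or the model is unreachable
    genai.configure(api_key=api_key)
    model = genai.GenerativeModel("gemini-2.0-flash")
    model.generate_content("ping")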

@@ -1,21 +0,0 @@
- gemini-2.0-flash-001
- gemini-2.0-flash-exp
- gemini-2.0-pro-exp-02-05
- gemini-2.0-flash-thinking-exp-1219
- gemini-2.0-flash-thinking-exp-01-21
- gemini-1.5-pro
- gemini-1.5-pro-latest
- gemini-1.5-pro-001
- gemini-1.5-pro-002
- gemini-1.5-pro-exp-0801
- gemini-1.5-pro-exp-0827
- gemini-1.5-flash
- gemini-1.5-flash-latest
- gemini-1.5-flash-001
- gemini-1.5-flash-002
- gemini-1.5-flash-exp-0827
- gemini-1.5-flash-8b-exp-0827
- gemini-1.5-flash-8b-exp-0924
- gemini-exp-1206
- gemini-exp-1121
- gemini-exp-1114

@@ -1,33 +0,0 @@
- gpt-4.1
- o1
- o1-2024-12-17
- o1-mini
- o1-mini-2024-09-12
- o3-mini
- o3-mini-2025-01-31
- gpt-4
- gpt-4o
- gpt-4o-2024-05-13
- gpt-4o-2024-08-06
- gpt-4o-2024-11-20
- chatgpt-4o-latest
- gpt-4o-mini
- gpt-4o-mini-2024-07-18
- gpt-4-turbo
- gpt-4-turbo-2024-04-09
- gpt-4-turbo-preview
- gpt-4-32k
- gpt-4-1106-preview
- gpt-4-0125-preview
- gpt-4-vision-preview
- gpt-3.5-turbo
- gpt-3.5-turbo-16k
- gpt-3.5-turbo-16k-0613
- gpt-3.5-turbo-0125
- gpt-3.5-turbo-1106
- gpt-3.5-turbo-0613
- gpt-3.5-turbo-instruct
- gpt-4o-audio-preview
- o1-preview
- o1-preview-2024-09-12
- text-davinci-003

@@ -1,60 +0,0 @@
model: gpt-4.1
label:
zh_Hans: gpt-4.1
en_US: gpt-4.1
model_type: llm
features:
- multi-tool-call
- agent-thought
- stream-tool-call
- vision
model_properties:
mode: chat
context_size: 1047576
parameter_rules:
- name: temperature
use_template: temperature
- name: top_p
use_template: top_p
- name: presence_penalty
use_template: presence_penalty
- name: frequency_penalty
use_template: frequency_penalty
- name: max_tokens
use_template: max_tokens
default: 512
min: 1
max: 32768
- name: reasoning_effort
label:
zh_Hans: 推理工作
en_US: Reasoning Effort
type: string
help:
zh_Hans: 限制推理模型的推理工作
en_US: Constrains effort on reasoning for reasoning models
required: false
options:
- low
- medium
- high
- name: response_format
label:
zh_Hans: 回复格式
en_US: Response Format
type: string
help:
zh_Hans: 指定模型必须输出的格式
en_US: specifying the format that the model must output
required: false
options:
- text
- json_object
- json_schema
- name: json_schema
use_template: json_schema
pricing:
input: '2.00'
output: '8.00'
unit: '0.000001'
currency: USD
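
The `response_format` options above (`text`, `json_object`, `json_schema`) correspond to OpenAI's structured-output settings. A minimal sketch (not part of the diff) of the `json_schema` case, assuming the openai Python SDK and an OPENAI_API_KEY in the environment; the schema itself is a made-up example:

from openai import OpenAI

client = OpenAI()
completion = client.chat.completions.create(
    model="gpt-4.1",
    messages=[{"role": "user", "content": "Answer in JSON."}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "answer",
            "schema": {
                "type": "object",
                "properties": {"answer": {"type": "string"}},
                "required": ["answer"],
                "additionalProperties": False,
            },
        },
    },
)
print(completion.choices[0].message.content)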

File diff suppressed because it is too large

@@ -1,37 +0,0 @@
model: gemini-2.0-flash-001
label:
en_US: Gemini 2.0 Flash 001
model_type: llm
features:
- agent-thought
- vision
model_properties:
mode: chat
context_size: 1048576
parameter_rules:
- name: temperature
use_template: temperature
- name: top_p
use_template: top_p
- name: top_k
label:
en_US: Top k
type: int
help:
en_US: Only sample from the top K options for each subsequent token.
required: false
- name: presence_penalty
use_template: presence_penalty
- name: frequency_penalty
use_template: frequency_penalty
- name: max_output_tokens
use_template: max_tokens
required: true
default: 8192
min: 1
max: 8192
pricing:
input: '0.00'
output: '0.00'
unit: '0.000001'
currency: USD

@@ -1,80 +0,0 @@
- google
- bing
- perplexity
- duckduckgo
- searchapi
- serper
- searxng
- websearch
- tavily
- stackexchange
- pubmed
- arxiv
- aws
- nominatim
- devdocs
- spider
- firecrawl
- brave
- crossref
- jina
- webscraper
- dalle
- azuredalle
- stability
- stablediffusion
- cogview
- comfyui
- getimgai
- siliconflow
- spark
- stepfun
- xinference
- alphavantage
- yahoo
- openweather
- gaode
- aippt
- chart
- youtube
- did
- dingtalk
- discord
- feishu
- feishu_base
- feishu_document
- feishu_message
- feishu_wiki
- feishu_task
- feishu_calendar
- feishu_spreadsheet
- lark_base
- lark_document
- lark_message_and_group
- lark_wiki
- lark_task
- lark_calendar
- lark_spreadsheet
- slack
- twilio
- wecom
- wikipedia
- code
- wolframalpha
- maths
- github
- gitlab
- time
- vectorizer
- qrcode
- tianditu
- aliyuque
- google_translate
- hap
- json_process
- judge0ce
- novitaai
- onebot
- regex
- trello
- fal

api/poetry.lock (generated): 12392 lines changed

File diff suppressed because it is too large