gcgj-dify-1.7.0/api/core/rag/splitter
baonudesifeizhai c14a6a6609 fix: support Chinese regex separators in text segmentation
- Remove re.escape() usage to preserve regex functionality
- Replace str.split() with re.split() for regex support
- Replace 'in' operator with re.search() for regex patterns
- Add proper separator preservation logic for regex patterns
- Filter out short chunks containing only symbols

Fixes #22765
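
The changes listed in this commit message can be illustrated with a minimal sketch. It is not the code from fixed_text_splitter.py or text_splitter.py: the function names `split_with_regex` and `is_symbol_only`, the `max_symbol_len` threshold, and the sample separator `[。！？]` are all hypothetical and chosen only for illustration. The sketch assumes the separator is a valid regular expression without capturing groups of its own.

```python
import re


def is_symbol_only(chunk: str, max_symbol_len: int = 3) -> bool:
    """Return True for short chunks with no word characters.

    Hypothetical filter: the threshold of 3 is an assumption, not a value
    taken from the commit. In Python 3, \\w matches Unicode word characters,
    so Chinese text counts as content rather than as symbols.
    """
    stripped = chunk.strip()
    return len(stripped) <= max_symbol_len and re.search(r"\w", stripped) is None


def split_with_regex(text: str, separator: str, keep_separator: bool = True) -> list[str]:
    """Split text on a regex separator (hypothetical helper).

    The separator is used as a pattern, not passed through re.escape(),
    so character classes such as "[。！？]" keep their regex meaning.
    """
    # re.search() instead of the 'in' operator: 'in' would look for the
    # pattern text literally and never find "[。！？]" in ordinary prose.
    if not separator or re.search(separator, text) is None:
        return [] if is_symbol_only(text) else [text]

    if keep_separator:
        # A capturing group makes re.split() return the matched separators
        # too, as: chunk, sep, chunk, sep, ..., chunk.
        parts = re.split(f"({separator})", text)
        # Re-attach each separator to the chunk that precedes it.
        chunks = [
            parts[i] + (parts[i + 1] if i + 1 < len(parts) else "")
            for i in range(0, len(parts), 2)
        ]
    else:
        chunks = re.split(separator, text)

    # Drop empty pieces and short pieces made up only of punctuation.
    return [c for c in chunks if not is_symbol_only(c)]


if __name__ == "__main__":
    sample = "你好。今天天气很好！！你呢？"
    print(split_with_regex(sample, "[。！？]"))
    print(split_with_regex(sample, "[。！？]", keep_separator=False))
```

In this sketch the doubled "！！" produces a chunk consisting of a single punctuation mark, which the filter discards. Escaping the same separator with re.escape() would make it match only the literal text "[。！？]", so the sample would come back unsplit.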
8 months ago
__init__.py Feat/delete single dataset retrival (#6570) 2 years ago
fixed_text_splitter.py fix: support Chinese regex separators in text segmentation 8 months ago
text_splitter.py fix: support Chinese regex separators in text segmentation 8 months ago