en_US: If you choose not to wait, it will return a job ID immediately. You can use this job ID to check the crawl results or cancel the crawl task, which is especially useful for large-scale crawling tasks.
  - name: excludes
    type: string
    required: false
    label:
      en_US: URL patterns to exclude
      zh_Hans: 要排除的URL模式
    placeholder:
      en_US: Use commas to separate multiple tags
      zh_Hans: 多个标签时使用半角逗号分隔
    human_description:
      en_US: |
        Pages matching these patterns will be skipped. Example: blog/*, about/*
      zh_Hans: 匹配这些模式的页面将被跳过。示例:blog/*, about/*
    form: form
  - name: includePaths
    type: string
    required: false
    label:
      en_US: URL patterns to include
      zh_Hans: 要包含的URL模式
    human_description:
      en_US: |
        Only pages matching these patterns will be crawled. Example: blog/*, about/*
      zh_Hans: 只有与这些模式匹配的页面才会被爬取。示例:blog/*, about/*
    form: form
  - name: returnOnlyUrls
    type: boolean
    default: false
    label:
      en_US: Return Only URLs
      zh_Hans: 仅返回URL
    human_description:
      en_US: |
        If true, returns only the URLs as a list in the crawl status. Attention: the response will contain a list of URLs inside the data, not a list of documents.
      zh_Hans: 只返回爬取到的网页链接,而不是网页内容本身。
en_US: The crawling mode to use. Fast mode crawls websites without a sitemap 4x faster, but may not be as accurate and shouldn't be used on heavily JS-rendered websites.
en_US: Enables the crawler to navigate from a specific URL to previously linked pages. For instance, from 'example.com/product/123' back to 'example.com/product'.
The URL to send the webhook to. This will trigger for crawl started (crawl.started), every page crawled (crawl.page), and when the crawl is completed (crawl.completed or crawl.failed). The response will be the same as the /scrape endpoint.
en_US: Input a website and get all the URLs on the website - extremely fast
zh_Hans: 输入一个网站,快速获取网站上的所有网址。
llm: Input a website and get all the URLs on the website - extremely fast
parameters:
  - name: url
    type: string
    required: true
    label:
      en_US: Start URL
      zh_Hans: 起始URL
    human_description:
      en_US: The base URL to start crawling from.
      zh_Hans: 要爬取网站的起始URL。
    llm_description: The URL of the website that needs to be crawled. This is a required parameter.
    form: llm
  - name: search
    type: string
    label:
      en_US: Search
      zh_Hans: 搜索查询
    human_description:
      en_US: Search query to use for mapping. During the Alpha phase, the 'smart' part of the search functionality is limited to 100 search results. However, if map finds more results, there is no limit applied.
    llm_description: Search query to use for mapping. During the Alpha phase, the 'smart' part of the search functionality is limited to 100 search results. However, if map finds more results, there is no limit applied.
The extraction mode to use. 'markdown': Returns the scraped markdown content and does not perform LLM extraction. 'llm-extraction': Extracts information from the cleaned and parsed content using an LLM.