
Knowledge

Uploads and supported formats

Which file types FlexyAgents ingests, how large uploads are handled, images and media with optional AI enrichment, and editorial tips so chunks stay meaningful after embedding.


Uploads are the fastest way to bootstrap a knowledge base when your content already lives in files. FlexyAgents extracts text, splits it for embedding, and attaches it to the bases you select.

Images may go through OCR plus an optional Gemini-powered description, and audio and video may be transcribed when Gemini is available. See Documentation → Knowledge → AI vision, transcription & limits for quotas and bring-your-own-key (BYOK) setup.

Quality still matters: OCR’d PDFs, complex tables, and slide decks may need cleanup for reliable retrieval.

Supported formats

Typical dashboards support the following:

  • Office formats: Word, Excel, CSV.
  • Documents and markup: PDF, HTML, Markdown, JSON, XML, RTF.
  • Archives: ZIP, RAR, TAR, GZ.
  • Images: JPEG, PNG, GIF, WebP, BMP, TIFF, and SVG (text extraction).
  • Media: e.g. MP4, WebM, MP3, WAV, OGG.

Exact allowlists and maximum file sizes can change, so treat the in-product uploader as the source of truth. Some archive types (e.g. 7z) may be blocked.

Password-protected or rights-managed files must be decrypted before upload.

  • PDFs: prefer text-native PDFs; scanned PDFs rely more heavily on OCR.
  • Images: supply meaningful filenames. For crawled pages, alt text in the source HTML helps even when vision is off.
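
A local pre-flight check can catch obviously unsupported files before they reach the uploader. The sketch below is illustrative only: `ALLOWED_EXTENSIONS` is built from the formats named in this section (the specific extensions, such as `.docx` and `.xlsx`, are assumptions), and `precheck` and `max_bytes` are hypothetical names, not part of FlexyAgents. The in-product uploader remains the source of truth.

```python
from pathlib import Path

# Assumed allowlist derived from the formats listed above; the real
# allowlist lives in the in-product uploader and may differ.
ALLOWED_EXTENSIONS = {
    ".docx", ".xlsx", ".csv", ".pdf", ".html", ".md", ".json", ".xml", ".rtf",
    ".zip", ".rar", ".tar", ".gz",
    ".jpg", ".jpeg", ".png", ".gif", ".webp", ".bmp", ".tiff", ".svg",
    ".mp4", ".webm", ".mp3", ".wav", ".ogg",
}


def precheck(name: str, size_bytes: int, max_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the local check passed."""
    problems = []
    ext = Path(name).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"extension {ext or '(none)'} is not in the local allowlist")
    if size_bytes > max_bytes:
        problems.append(f"size {size_bytes} exceeds the assumed limit {max_bytes}")
    return problems
```

Passing this check does not guarantee acceptance: encrypted or rights-managed files, for instance, still need to be decrypted first.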

Structuring for retrieval

Use headings, short sections, and explicit titles so chunk boundaries align with concepts. Avoid burying critical facts in images without alt text.
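
To preview how a Markdown file might break into concept-sized pieces, you can split it at headings locally before uploading. This is a rough sketch and not FlexyAgents' actual chunker, whose behavior is not documented here; `split_by_headings` is a hypothetical helper.

```python
import re

# Matches ATX-style Markdown headings such as "# Title" or "### Sub-section".
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def split_by_headings(markdown: str) -> list[tuple[str, str]]:
    """Split Markdown into (title, body) pairs, one pair per heading."""
    sections = []
    title, body = "(preamble)", []
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            # Close the section collected so far and start a new one.
            sections.append((title, "\n".join(body).strip()))
            title, body = match.group(2).strip(), []
        else:
            body.append(line)
    sections.append((title, "\n".join(body).strip()))
    # Drop the preamble when nothing precedes the first heading.
    return [(t, b) for t, b in sections if t != "(preamble)" or b]
```

Sections with vague titles or very long bodies in this preview are good candidates for editing before upload.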

Duplicate content across files increases noise; deduplicate or designate a canonical document.
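
One simple way to spot exact duplicates before upload is to hash each document's normalized text. The sketch below assumes you have already extracted the text yourself. `fingerprint` and `find_duplicates` are hypothetical helpers, and the check only catches copies that are identical after whitespace and case normalization, not near-duplicates.

```python
import hashlib
import re


def fingerprint(text: str) -> str:
    """Hash text after collapsing whitespace and lowercasing."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_duplicates(docs: dict[str, str]) -> dict[str, list[str]]:
    """Map each first-seen document name to later names with identical text."""
    first_seen: dict[str, str] = {}
    duplicates: dict[str, list[str]] = {}
    for name, text in docs.items():
        key = fingerprint(text)
        if key in first_seen:
            duplicates.setdefault(first_seen[key], []).append(name)
        else:
            first_seen[key] = name
    return duplicates
```

The first-seen file is treated as canonical here purely for convenience; in practice you would pick the canonical document deliberately.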

Volume and batch uploads

Product APIs and dashboard flows may accept multiple files per request, subject to rate limits. Stagger very large migrations to avoid timeouts and to keep an eye on the embedding backlog.
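
Staggering can be as simple as sending files in small groups with a pause between them. The sketch below makes no assumptions about the FlexyAgents API: `upload_batch` is a hypothetical callable you supply, and `batch_size` and `pause_seconds` are placeholder values to tune against the limits you actually observe.

```python
import time
from typing import Callable, Iterator


def batches(items: list[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def staggered_upload(
    paths: list[str],
    upload_batch: Callable[[list[str]], object],  # hypothetical: performs one upload request
    batch_size: int = 10,
    pause_seconds: float = 5.0,
    sleep: Callable[[float], object] = time.sleep,
) -> int:
    """Upload paths in groups, pausing between groups; return the number sent."""
    groups = list(batches(paths, batch_size))
    sent = 0
    for index, group in enumerate(groups):
        upload_batch(group)
        sent += len(group)
        if index < len(groups) - 1:  # no pause needed after the last group
            sleep(pause_seconds)
    return sent
```

The `sleep` parameter is injectable so the pacing logic can be exercised in tests without waiting.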

For enterprise migrations, pair uploads with connectors when content continues to change in the source system.

Updates and versioning

Re-upload revised files when connectors are not available; note that large replacements may take time to re-embed.
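
To limit re-embedding to files that actually changed, you can compare content hashes against a manifest saved from the previous upload. This is a sketch under stated assumptions: the manifest format (file name mapped to SHA-256 hex digest) and the `changed_files` helper are hypothetical and kept entirely on your side, not a FlexyAgents feature.

```python
import hashlib


def changed_files(current: dict[str, bytes], manifest: dict[str, str]) -> list[str]:
    """Return names whose content is new or differs from the stored manifest."""
    changed = []
    for name, data in current.items():
        digest = hashlib.sha256(data).hexdigest()
        if manifest.get(name) != digest:
            changed.append(name)
    return changed
```

Only the names returned need re-uploading; update the manifest after each successful upload so the next comparison stays accurate.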

Coordinate with legal when removing regulated content: delete or archive sources deliberately rather than leaving stale vectors.
