FlexyAgents can enrich knowledge with AI beyond plain text extraction: Gemini vision describes images so semantic search finds “screenshots of the billing page,” and Gemini transcription turns podcasts or training videos into searchable text.
On hosted infrastructure, your subscription may include separate monthly caps for (1) image recognition on uploads vs crawls, and (2) media transcription on uploads vs crawls. OCR (Tesseract-style text extraction from images) and basic metadata are separate from vision and do not replace it; they are not what the “image recognition” quotas measure.
Adding a valid Google Gemini API key for your organization (Settings → LLM API Keys) routes vision and transcription calls through your Google account, so hosted FlexyAgents quotas for those operations are not incremented.
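The accounting rules above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not FlexyAgents' actual code or API: `Organization`, `record_operation`, the operation names, and the returned labels are all hypothetical. It also assumes that uploads and crawls are counted in separate buckets and that any non-empty key counts as valid.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Hypothetical names throughout: a sketch of the quota rules described above,
# not the real FlexyAgents implementation.
AI_OPERATIONS = {"image_recognition", "media_transcription"}
SOURCES = {"upload", "crawl"}


@dataclass
class Organization:
    # Assumption: a non-empty key stands in for "a valid Gemini API key".
    gemini_api_key: Optional[str] = None
    # (operation, source) -> hosted calls counted this month
    hosted_usage: Dict[Tuple[str, str], int] = field(default_factory=dict)


def record_operation(org: Organization, operation: str, source: str) -> str:
    """Return a label for where the call is counted, updating hosted usage if needed."""
    if source not in SOURCES:
        raise ValueError(f"unknown source: {source!r}")
    if operation not in AI_OPERATIONS:
        # OCR and basic metadata are not what the AI quotas measure.
        return "outside_ai_quotas"
    if org.gemini_api_key:
        # The organization's own key: the hosted quota is not incremented.
        return "organization_google_account"
    key = (operation, source)
    org.hosted_usage[key] = org.hosted_usage.get(key, 0) + 1
    return "hosted_quota"


hosted_org = Organization()
own_key_org = Organization(gemini_api_key="example-key")

record_operation(hosted_org, "image_recognition", "upload")     # counted on hosted quota
record_operation(hosted_org, "ocr", "upload")                   # not counted
record_operation(own_key_org, "media_transcription", "crawl")   # billed to own Google account
```

In this sketch only the first call changes `hosted_usage`; the OCR call and the call made with an organization key leave the hosted counters untouched.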