Fulltext & Vector Search Functions
Available from version 11.48.0
These functions require the OPENSEARCH_ENABLED license/preference to be activated for your organization. Without it, every function raises RuntimeError("Fulltext search license is missing").
Functions for searching document archives, finding similar documents, and querying ERP master data. They search across all documents of the organization, unlike get_document_content(), which only reads the current document's text.
Security: The org_id is automatically injected by the script sandbox. You never need to pass it — your scripts always operate within your own organization's data.
Source: module/script/helper/document_script_functions.py
fulltext_search()
Searches the full OCR text of all documents in the organization. Finds text in pages.pageText, tfidfCustomPageText and ai_text fields via the fulltextsearch microservice.
fulltext_search(query, **kwargs)
Parameters:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| query | str | required | Search term (searched in OCR text of all documents) |
| search_type | str | "match_phrase" | "match_phrase" (exact phrase), "fuzzy" (typo-tolerant, up to 2 characters difference), "prefix" (starts with) |
| doc_type | str | None | Filter by document type (comma-separated, e.g. "INVOICE,CREDIT_NOTE") |
| status | str | None | Filter by document status (comma-separated, e.g. "ready_for_validation,exported") |
| vendor_name | str | None | Filter by vendor name |
| date_range | str | None | "last_30_days", "last_90_days", "last_180_days", "last_365_days" |
| size | int | 10 | Max results (capped at 50) |
Returns: list[dict]. Each dict contains:
| Key | Description |
| --- | --- |
| doc_id | Document UUID |
| name | Filename (e.g. "INV-2026-001.pdf") |
| doc_type | Document type ("INVOICE", "ORDER_CONFIRMATION", etc.) |
| vendor_name | Vendor name |
| status | Document status |
| total_amount | Total amount |
| ocr_content | Matched text excerpt from the document |
| highlights | Dict with highlighted matches per field |
Example — Search for exact phrase:
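A minimal sketch of an exact-phrase search. Inside a real script, fulltext_search() is provided by the sandbox; the stub below is only a stand-in with made-up data so the snippet runs on its own. The search phrase, filter values, and sample record are illustrative assumptions.

```python
# Stand-in for the sandbox-provided function (assumption: made-up data).
# Inside a real script, drop this stub; fulltext_search() already exists.
def fulltext_search(query, **kwargs):
    if not query:
        return []
    return [{
        "doc_id": "00000000-0000-0000-0000-000000000001",
        "name": "INV-2026-001.pdf",
        "doc_type": "INVOICE",
        "vendor_name": "Example GmbH",
        "status": "exported",
        "total_amount": 1250.0,
        "ocr_content": "... payment within 30 days net ...",
        "highlights": {},
    }]

# "match_phrase" is the default search_type; it is spelled out for clarity.
results = fulltext_search(
    "payment within 30 days",
    search_type="match_phrase",
    doc_type="INVOICE,CREDIT_NOTE",
    date_range="last_90_days",
    size=20,
)

for hit in results:
    print(hit["doc_id"], hit["name"], hit["vendor_name"])
```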
Example — Fuzzy search (OCR typo tolerant):
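A short sketch of a fuzzy search, which can help when OCR has misread a character or two. The stub stands in for the sandbox function and returns one made-up hit; the query and the sample excerpt are assumptions chosen for illustration.

```python
# Stand-in for the sandbox function (assumption). It records its arguments
# and returns one made-up hit so the sketch runs outside the sandbox.
calls = []

def fulltext_search(query, **kwargs):
    calls.append((query, kwargs))
    return [{
        "doc_id": "doc-1",
        "name": "scan_0042.pdf",
        "ocr_content": "Rechnungsnurnmer 4711",  # OCR misread "m" as "rn"
    }]

# search_type="fuzzy" tolerates small spelling differences in the OCR text.
hits = fulltext_search("Rechnungsnummer 4711", search_type="fuzzy", size=5)
names = [hit["name"] for hit in hits]
print(names)
```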
Example — Prefix search:
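A sketch of a prefix search combined with a status filter. The stub only imitates prefix matching on a tiny made-up archive and ignores the filters; it is an assumption for illustration and says nothing about how the real service matches.

```python
# Stand-in for the sandbox function (assumption): imitates "starts with"
# matching on two made-up documents and ignores the filter arguments.
def fulltext_search(query, **kwargs):
    archive = [
        {"doc_id": "doc-1", "name": "INV-2026-001.pdf",
         "ocr_content": "Invoice INV-2026-001"},
        {"doc_id": "doc-2", "name": "ORD-7781.pdf",
         "ocr_content": "Order ORD-7781"},
    ]
    return [
        doc for doc in archive
        if any(word.startswith(query) for word in doc["ocr_content"].split())
    ]

# Find every document containing a word that starts with "INV-2026".
hits = fulltext_search(
    "INV-2026",
    search_type="prefix",
    status="ready_for_validation,exported",
)
matching_ids = [hit["doc_id"] for hit in hits]
print(matching_ids)
```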
Empty query: Passing an empty string returns [] immediately without making an HTTP call.
Error handling: If the fulltextsearch service is unreachable, the function returns [] and logs a warning. It does not raise an exception, so an empty list can mean either "no matches" or "service unavailable".
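Because an empty list is a normal return value, scripts should not index into the result without checking it first. The helper below is a hypothetical convenience wrapper, not part of the documented API; it takes the search function as an argument so it can be tried outside the sandbox.

```python
def first_hit(search, query, default=None):
    """Return the first hit, or `default` when the result list is empty."""
    hits = search(query)
    return hits[0] if hits else default

# Stand-in that mimics the documented fallback: an empty list, no exception.
def unreachable_search(query, **kwargs):
    return []

best = first_hit(unreachable_search, "framework agreement")
print(best)
```

In a real script the first argument would be the sandbox-provided fulltext_search.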
vector_search()
Finds semantically similar documents using vector embeddings (k-NN search with 384-dimensional vectors). Useful for finding documents with similar content regardless of exact wording.
Parameters:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| doc_id | str | required | Source document UUID (the document to find similar matches for) |
| k | int | 5 | Number of similar documents to return (capped at 50) |
Returns: list[dict]. Each dict contains:
| Key | Description |
| --- | --- |
| doc_id | Similar document UUID |
| name | Filename |
| doc_type | Document type |
| similarity_score | Raw similarity score (0-1) |
| similarity_percent | Similarity as percentage (0-100) |
Example — Find similar documents:
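A minimal sketch that looks up similar documents and keeps only very close matches, for example as a duplicate hint. The stub replaces the sandbox-provided vector_search() with made-up data; the document IDs and the 95 percent threshold are assumptions, not recommended values.

```python
# Stand-in for the sandbox function (assumption: made-up results).
def vector_search(doc_id, k=5):
    sample = [
        {"doc_id": "doc-b", "name": "INV-2026-002.pdf", "doc_type": "INVOICE",
         "similarity_score": 0.97, "similarity_percent": 97.0},
        {"doc_id": "doc-c", "name": "INV-2025-113.pdf", "doc_type": "INVOICE",
         "similarity_score": 0.81, "similarity_percent": 81.0},
        {"doc_id": "doc-d", "name": "ORD-7781.pdf", "doc_type": "ORDER_CONFIRMATION",
         "similarity_score": 0.42, "similarity_percent": 42.0},
    ]
    return sample[:min(k, 50)]

current_doc_id = "doc-a"  # hypothetical ID of the document being processed
similar = vector_search(current_doc_id, k=3)

# Keep only near-identical documents; the threshold is an arbitrary example.
likely_duplicates = [d["name"] for d in similar if d["similarity_percent"] >= 95]
print(likely_duplicates)
```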
How it works: Each document is converted to a 384-dimensional vector when indexed. The vector search finds the nearest neighbors in this vector space, which correspond to semantically similar documents.
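The nearest-neighbor idea can be sketched in a few lines. This toy version uses 3-dimensional vectors instead of 384 and assumes cosine similarity as the metric; the source does not state which metric or index structure the service actually uses, so treat this purely as an illustration of the concept.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_neighbors(query_vec, index, k):
    """Brute-force k-NN: score every indexed vector, return the top k."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in index.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy index: made-up document IDs mapped to made-up embedding vectors.
index = {
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
    "doc-d": [0.7, 0.7, 0.0],
}

neighbors = nearest_neighbors([1.0, 0.0, 0.0], index, k=2)
print(neighbors)
```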
fulltext_search_erp()
Searches ERP master data (vendors, purchase orders, customers, materials) indexed in OpenSearch.
Parameters:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| query | str | required | Search term |
| entity_types | str | None | Filter by entity type (comma-separated: "vendor", "purchase_order", "customer", "material") |
| vendor_number | str | None | Filter by vendor number |
| vendor_name | str | None | Filter by vendor name |
| company_code | str | None | Filter by company code |
| size | int | 10 | Max results (capped at 50) |
Returns: list[dict] — Entity-type-specific fields (vendor records have vendor_number, vendor_name, etc.)
Example — Validate vendor in ERP:
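A sketch of a vendor check against ERP master data. The stub replaces the sandbox-provided fulltext_search_erp() with one made-up vendor record, and vendor_exists() is a hypothetical helper written for this example. Only vendor_number and vendor_name are taken from the documented vendor fields.

```python
# Stand-in for the sandbox function (assumption: one made-up vendor record).
def fulltext_search_erp(query, **kwargs):
    vendors = [{"vendor_number": "100234", "vendor_name": "Example GmbH"}]
    if not query:
        return []
    return [v for v in vendors if query.lower() in v["vendor_name"].lower()]

def vendor_exists(name, company_code=None):
    """Hypothetical helper: True if a vendor with exactly this name is found."""
    filters = {"entity_types": "vendor", "size": 5}
    if company_code:
        filters["company_code"] = company_code
    matches = fulltext_search_erp(name, **filters)
    # Compare case-insensitively; .get() guards against records without a name.
    return any(m.get("vendor_name", "").lower() == name.lower() for m in matches)

print(vendor_exists("Example GmbH"))
print(vendor_exists("Unknown AG"))
```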
Example — Search purchase orders:
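A sketch of a purchase order lookup restricted to one vendor. The fields of purchase order records are not listed in this page, so the stub's po_number key is a hypothetical placeholder and the loop prints whole records rather than relying on specific keys.

```python
# Stand-in for the sandbox function (assumption). The "po_number" key is a
# hypothetical field name; real purchase order records may look different.
def fulltext_search_erp(query, **kwargs):
    return [{"po_number": "4500001234", "vendor_number": "100234"}]

orders = fulltext_search_erp(
    "4500001234",
    entity_types="purchase_order",
    vendor_number="100234",
    size=10,
)

# Print whole records first to see which fields are actually returned.
for order in orders:
    print(order)
```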
fulltext_suggestions()
Returns autocomplete suggestions for search terms. Groups results by category (vendors, filenames, invoice numbers).
Parameters:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| query | str | required | Prefix / search term |
| limit | int | 10 | Max suggestions per category (capped at 20) |
Returns: dict with suggestions grouped by category (vendors, filenames, invoice numbers).
Example — Get vendor suggestions:
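A sketch of reading vendor suggestions from the grouped result. The exact category keys are not spelled out in this page, so "vendors", "filenames", and "invoice_numbers" in the stub are assumed names; reading them with .get() keeps the script from failing if the real keys differ.

```python
# Stand-in for the sandbox function (assumption: key names and data made up).
def fulltext_suggestions(query, limit=10):
    if not query:
        return {}
    return {
        "vendors": ["Example GmbH", "Example Logistics AG"][:limit],
        "filenames": [],
        "invoice_numbers": [],
    }

suggestions = fulltext_suggestions("Exa", limit=5)

# .get() with a default avoids a KeyError when a category is absent.
vendor_suggestions = suggestions.get("vendors", [])
print(vendor_suggestions)
```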
Empty query: Passing an empty string returns {} immediately.
Quick Reference
| Function | Description | Returns |
| --- | --- | --- |
| fulltext_search(query, ...) | Search OCR text across all documents | list[dict] |
| vector_search(doc_id, ...) | Find semantically similar documents | list[dict] |
| fulltext_search_erp(query, ...) | Search ERP master data | list[dict] |
| fulltext_suggestions(query, ...) | Autocomplete suggestions | dict |
Common Patterns
License Check
All four functions automatically check the OPENSEARCH_ENABLED preference. If it is not enabled, they raise RuntimeError("Fulltext search license is missing").
To handle this gracefully in scripts:
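One possible approach is a thin wrapper that catches the license error and falls back to an empty result, while letting any other RuntimeError propagate. safe_fulltext_search() is a hypothetical helper, and the stub simulates an organization without the license so the snippet runs outside the sandbox.

```python
# Stand-in that simulates a missing license (assumption for this sketch).
def fulltext_search(query, **kwargs):
    raise RuntimeError("Fulltext search license is missing")

def safe_fulltext_search(query, **kwargs):
    """Hypothetical wrapper: return [] when the license is missing."""
    try:
        return fulltext_search(query, **kwargs)
    except RuntimeError as exc:
        if "license is missing" in str(exc):
            return []
        raise  # unrelated runtime errors should still surface

results = safe_fulltext_search("payment within 30 days")
print(results)
```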
Combining with Field Functions
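A sketch of how a search result might feed back into document fields, here flagging other invoices that contain the same invoice number. The field accessors get_field_value() and set_field_value(), the field names, and current_doc_id are hypothetical stand-ins, since this page does not name the field functions; fulltext_search() is stubbed with made-up data.

```python
# Hypothetical field storage and accessors (assumptions, not the real API).
fields = {"invoice_number": "RE-2026-0815", "duplicate_warning": ""}

def get_field_value(name):
    return fields[name]

def set_field_value(name, value):
    fields[name] = value

# Stand-in for the sandbox search function (assumption: made-up hits).
def fulltext_search(query, **kwargs):
    return [
        {"doc_id": "doc-current", "name": "scan_0101.pdf"},
        {"doc_id": "doc-older", "name": "scan_0057.pdf"},
    ]

current_doc_id = "doc-current"  # hypothetical ID of the document in process

invoice_number = get_field_value("invoice_number")
hits = fulltext_search(invoice_number, doc_type="INVOICE", size=10)

# The search covers the whole organization, so exclude the current document.
others = [hit["doc_id"] for hit in hits if hit["doc_id"] != current_doc_id]
if others:
    set_field_value("duplicate_warning", "Possible duplicates: " + ", ".join(others))

print(fields["duplicate_warning"])
```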