One of the main features of LogicalDOC is full-text indexing of all documents to provide instant search results based on the content of files and metadata.
LogicalDOC automatically indexes the complete content of the documents in the repository. To maximize performance and increase concurrency, indexing runs asynchronously under a configurable scheduling policy.
The search engine is highly configurable: you can define the item count, the repository that stores the index, and the order in which documents are processed. You can also apply inclusion/exclusion filters to limit the documents to be processed (optionally applying them only to metadata), configure the number and limits of the threads used for analysis, and set the batch size and the parsing timeout (the number of documents processed in a batch and the maximum time allowed to process a single document).
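The interplay of inclusion/exclusion filters, batch size and parsing timeout can be illustrated with a minimal Python sketch. It is not LogicalDOC's implementation: the function names `is_selected` and `index_batch`, the glob-style patterns and the `parse` callback are all hypothetical and chosen only for illustration.

```python
import fnmatch
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def is_selected(filename, includes, excludes):
    """Return True if the file name passes the inclusion/exclusion filters."""
    # Assumption: an empty inclusion list means "include everything".
    included = not includes or any(fnmatch.fnmatch(filename, p) for p in includes)
    excluded = any(fnmatch.fnmatch(filename, p) for p in excludes)
    return included and not excluded


def index_batch(queue, parse, includes=(), excludes=(), batch_size=2, parse_timeout=1.0):
    """Parse at most batch_size selected documents; skip any that exceed parse_timeout."""
    # Keep the queue order, drop filtered-out documents, then cut to the batch size.
    batch = [name for name in queue if is_selected(name, includes, excludes)][:batch_size]
    indexed, skipped = {}, []
    # One worker per document so a slow parse does not delay the next one.
    with ThreadPoolExecutor(max_workers=max(1, len(batch))) as pool:
        for name in batch:
            future = pool.submit(parse, name)
            try:
                indexed[name] = future.result(timeout=parse_timeout)
            except FutureTimeout:
                # The document took too long to parse; leave it out of the index.
                skipped.append(name)
    return indexed, skipped
```

Calling `index_batch(names, parse, includes=["*.pdf"], batch_size=3, parse_timeout=0.1)` returns a dictionary of extracted text for the documents parsed in time, plus a list of the documents that were skipped because parsing exceeded the timeout.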
The extracted text passes through a series of configurable filters, which transform it into a standardized form suitable for indexing. The Filters tab lists all the available Stemmer, Worddelimiter and Ngram filters, each of which has its own configurable parameters.
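As a rough illustration of what such a filter chain does, the following Python sketch passes tokens through a word-delimiter step, lowercasing, a deliberately naive suffix-stripping stemmer and an n-gram step. All function names are hypothetical, the word delimiter handles ASCII letters and digits only, and real stemmers are considerably more sophisticated and language-specific; this is not the code LogicalDOC runs.

```python
import re


def word_delimiter(tokens):
    """Split tokens on punctuation, case changes and letter/digit boundaries (ASCII only)."""
    out = []
    for token in tokens:
        out.extend(re.findall(r"[A-Z]?[a-z]+|[A-Z]+|[0-9]+", token))
    return out


def lowercase(tokens):
    """Normalize case so that 'Invoice' and 'invoice' index to the same term."""
    return [token.lower() for token in tokens]


def stem(tokens):
    """Naive stemmer: strip one common English suffix, keeping at least 3 characters."""
    suffixes = ("ing", "ed", "s")
    out = []
    for token in tokens:
        for suffix in suffixes:
            if token.endswith(suffix) and len(token) - len(suffix) >= 3:
                token = token[: -len(suffix)]
                break
        out.append(token)
    return out


def ngrams(tokens, n=3):
    """Replace each token with its character n-grams; short tokens are kept whole."""
    out = []
    for token in tokens:
        if len(token) <= n:
            out.append(token)
        else:
            out.extend(token[i:i + n] for i in range(len(token) - n + 1))
    return out


def analyze(text, filters):
    """Split text on whitespace, then apply each filter in order."""
    tokens = text.split()
    for apply_filter in filters:
        tokens = apply_filter(tokens)
    return tokens
```

The order of the filters matters in this sketch: the word delimiter relies on case changes, so it has to run before lowercasing. For example, `analyze("Scanned PowerShot invoices", [word_delimiter, lowercase, stem])` yields `["scann", "power", "shot", "invoice"]`.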
The control panel offers numerous languages that you can easily enable or disable. Different algorithms are applied depending on the language of the document, so that search is tailored to that language and can detect the word variants specific to it.
When a user searches, the search engine consults the index to find relevant matches and returns results based on the specified criteria.
This process ensures that documents are quickly searchable and that content is indexed efficiently, allowing users to find accurate and relevant information within their document archives.
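The idea of consulting an index instead of scanning every file can be shown with a minimal inverted index in Python. This is a sketch of the general technique under simplifying assumptions (whitespace tokenization, all query terms must match); the class name `InvertedIndex` and its methods are hypothetical and do not reflect LogicalDOC's internal API.

```python
from collections import defaultdict


class InvertedIndex:
    """Map each term to the set of document IDs that contain it."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, doc_id, text):
        """Index a document by recording every lowercased term it contains."""
        for term in text.lower().split():
            self._postings[term].add(doc_id)

    def search(self, query):
        """Return the IDs of documents that contain every term of the query."""
        terms = query.lower().split()
        if not terms:
            return set()
        # Start from the postings of the first term, then intersect with the rest.
        result = set(self._postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self._postings.get(term, set())
        return result
```

After `index.add(1, "Annual budget report")` and `index.add(2, "Budget meeting notes")`, the call `index.search("budget report")` returns `{1}`. The lookup cost depends on the number of query terms and the size of their postings, not on the total volume of stored text, which is why indexed search returns results quickly.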
Benefits of this feature
- Drastically reduces the need for data entry: the full content of each document is indexed automatically, which is enough to find the desired information
- Users can find the information they need in a few seconds
Feature details
- Background indexing with scheduling policies
- Indexing algorithms for many languages
- Support for the most widely used office formats (Microsoft Office, OpenOffice, PDF and many more)
- Integrated OCR for extracting text from images and raster PDFs
- Search and Indexing from the Administrators' Guide
- Full-text Search from the Users' Guide