Frequently Asked Questions
Application Behavior
- Will the VAF application be able to analyze bulk volumes of documents, e.g. in a connected network folder, and then migrate, classify, and promote external documents that meet specific filter requirements?
The VAF application works with objects in M-Files, reacting to events and object changes in the vault. For instance, documents from a network folder can be imported into M-Files under a general classification, such as “Other Document.” Once they are in M-Files, a rule can be run on all of them to classify them and extract information.
NOTE: The VAF Application is not available in the Free Trial.
- How well does the model work with Multi-Lookup Fields, such as existing object types or value lists (not just Textlists)? Can we also identify objects with this method, e.g., linking the correct vendor name to an invoice?
Yes, the model works well with multi-select lookup fields and with existing objects or value lists. When identifying objects, such as linking a vendor name to an invoice, the system searches for the best match. M-Files Information Extractor can only be used when the data is in a value list. Our approach, by contrast, finds the best match even if that requires minor adjustments, such as removing a dot, and you can configure the matching process. The same approach applies to multi-select lookup fields.
- Does the analysis have to be set to automatic or can it be a manual selection?
Analysis starts immediately when a document is added to M-Files. A feature that lets you start the analysis manually can be added in the next version.
- Is there any need for doing external OCR?
No, there is no need for external OCR. The Azure service includes OCR capabilities as part of its processing pipeline. When you send your documents to the service for data extraction or classification, OCR is automatically performed by Azure’s OCR services. You also have the option to use high-resolution OCR, which can provide higher-quality results. However, this option may incur additional costs under the pay-as-you-go model.
Language Support
- How well do the prebuilt models work in languages other than English?
Azure AI models are capable of handling multilingual documents. When you use the service to process documents, the language for each field is automatically detected. This means if your document contains multiple languages across different pages, each language will be identified for its respective fields.
Most of the models are available in 40+ languages. The prebuilt contract model is available in English only. For a full list of languages, refer to the link below.
Language and locale support for Read and Layout document analysis
- What about content in multiple languages? Is there a way to use a specific model based on content language?
You have a few options to handle content in multiple languages. You can use existing pre-trained models designed to work across languages. These models provide outputs based on their predefined capabilities and confidence levels. Also, you can train a custom model that performs extraction across multiple languages. Alternatively, if your content primarily uses a specific language, you can create a custom model optimized for that language to achieve higher confidence and accuracy.
Product Comparison
- What is the difference between the Analyze function in AI Document Kit and M-Files Information Extractor?
M-Files Information Extractor uses regular expressions (regex) and works only with value lists. In contrast, the Analyze function in AI Document Kit lets you train a model that can extract all types of data, whether the targets are value lists or objects. The function compares the extracted data with existing values and matches it based on confidence levels. If an exact match isn’t found, it provides the best possible match using the information available. This makes the Analyze function more flexible and able to handle a broader range of data types.
- How does your AI feature differ from M-Files’ solutions?
Aino and Copilot are based on Generative AI technologies, primarily used to create data from existing content. M-Files is currently focused on leveraging existing content within M-Files to help answer your questions. With Aino, you can provide context—such as a view, a document, a search, or your entire vault—and based on that context, you can ask a question and receive new content as an answer.
What we are doing here involves information extraction powered by Azure AI services. Our focus is on automatic data extraction, which is a different concept from Aino and Copilot.
- If a contract end date is given as a period in a contract, for example, 5 years after the start date, would it be possible to extract/calculate the exact date?
This would be a task for generative AI models, which extract data from your document semantically. AI Document Kit does not analyze the document semantically and does not generate a value that is not present in it. However, if the text explicitly mentions a period of “5 years”, this value can be extracted. When both the start and end dates are available in the document and are extracted with AI Document Kit, you can use our Extension Kit to calculate the length of the period and save it to a separate property.
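The period calculation mentioned above comes down to simple date arithmetic. The sketch below shows that arithmetic in plain Python; it does not use the Extension Kit, and the `contract_period` function and the sample dates are assumptions made for illustration only.

```python
from datetime import date


def contract_period(start: date, end: date) -> tuple[int, int]:
    """Return (whole_years, total_days) between two extracted dates."""
    if end < start:
        raise ValueError("end date precedes start date")
    years = end.year - start.year
    # Subtract one year if the anniversary has not been reached yet.
    if (end.month, end.day) < (start.month, start.day):
        years -= 1
    return years, (end - start).days


# Hypothetical values, as they might arrive from extraction in ISO format.
start_date = date.fromisoformat("2024-03-01")
end_date = date.fromisoformat("2029-03-01")

years, days = contract_period(start_date, end_date)
print(years, days)  # 5 1826
```

The result could then be written to a separate property, as the answer describes.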
- From a pricing standpoint, how does this compare to using both document classifier and smart metadata from Business Licensing?
You can use it regardless of your M-Files licensing model (basic, business, or enterprise). Cloud pricing is pay-as-you-use, and our pricing will not differ much from that, so it will be affordable.
Region and Privacy Details
- For an on-premises setup, how CPU intensive is running AI Document Kit? For example, would I need a separate server?
The training process is performed in the cloud, so we send the information to the cloud service. This process is not CPU-intensive, and you won’t need a separate server. If used on-premises, AI Document Kit can be installed in a separate container and managed at the container level.
- Which cloud service are you using?
We use Azure services, and the service can be provisioned in your region so that the data stays in your region. For example, if you’re in Switzerland, it will be provisioned in the Swiss region; if you’re in Germany, it will be provisioned in the German region; and so on.
NOTE: On-premises Document Intelligence is not available in the Free Trial.
- What is the server location? Do we need to set up our own Azure tenant, or does this go through your tenant?
The Azure AI service can be provisioned in any region of your choice. Our approach involves provisioning the service for you. If you have your own Azure tenant, you can use it. We will provide you with the necessary license to utilize your own Azure services. This flexibility ensures that you can choose the deployment location that best suits your needs, whether through our tenant or your own Azure environment.
- How is it possible that documents don’t go to the AI cloud and are analyzed locally (on-premises)?
Azure (Microsoft) allows us to run everything on-premises within your own environment using a container. There are two ways to do this:
- Pay-as-you-use Model: You can use the model on-premises and only need to connect to Azure once a month to report usage (i.e. the number of requests to the model, for example, 30 requests).
- Prepaid Requests: You can buy a package of 10,000 requests (where one request equals one page). With this option, you don’t need to communicate with Azure, as everything will be handled within your environment.
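As a rough illustration of how usage is counted under either option, the sketch below tracks requests against a prepaid package, using the rule that one request equals one page. The `PrepaidPackage` class and the page counts are hypothetical and are not part of the product or of Azure.

```python
from dataclasses import dataclass


@dataclass
class PrepaidPackage:
    """Tracks a prepaid block of requests, where one request equals one page."""

    total_requests: int = 10_000
    used_requests: int = 0

    @property
    def remaining(self) -> int:
        return self.total_requests - self.used_requests

    def consume(self, page_count: int) -> None:
        """Deduct one request per analyzed page."""
        if page_count < 0:
            raise ValueError("page_count must not be negative")
        if page_count > self.remaining:
            raise RuntimeError("prepaid package exhausted")
        self.used_requests += page_count


package = PrepaidPackage()
# Three hypothetical documents with 12, 3, and 15 pages.
for pages in (12, 3, 15):
    package.consume(pages)

print(package.used_requests, package.remaining)  # 30 9970
```

Under the pay-as-you-use option, the same running total (30 requests in this example) is the figure that would be reported once a month.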
NOTE: The default Free Trial resource group is provisioned in West Europe, but this can be changed on request.