Data Extraction with yuuvis® RAD extraction-service
yuuvis® RAD extraction-service is a microservice that extracts the following data from files of various formats:
- EXIF data from audio, video and image files;
- XMP data from Office and PDF documents;
- E-invoices:
  - ZUGFeRD (versions 1 and 2.3);
  - Factur-X (version 1.0);
  - XRechnung (version 3.0.1) with the UN/CEFACT Cross Industry Invoice and Universal Business Language 2.1 syntaxes;
- Standard properties from e-mails in MSG or EML format.
The service is called in yuuvis® RAD client when newly created or modified documents are saved. The extracted data is saved as metadata if mapping is configured accordingly.
Always test extraction with sample data. Mapping can cause errors, particularly when data types or formats do not match.
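As a minimal sketch of such a test, the following Python snippet checks sample extraction results against the data types of the target fields before the mapping is enabled. The type names, the EXPECTED_TYPES table, and the aliases in the demo namespace are hypothetical; only extract.OS:Title is taken from this page, and ISO 8601 is assumed as the date format.

```python
from datetime import datetime

# Hypothetical mapping of alias -> target field type. In practice the types
# come from the object type definition in yuuvis RAD designer.
EXPECTED_TYPES = {
    "extract.OS:Title": "string",
    "demo.Example:Date": "datetime",   # hypothetical alias
    "demo.Example:Count": "integer",   # hypothetical alias
}


def matches_type(value, type_name):
    """Return True if the extracted value fits the target type."""
    if type_name == "string":
        return isinstance(value, str)
    try:
        if type_name == "integer":
            int(value)
        elif type_name == "datetime":
            datetime.fromisoformat(value)  # assumes ISO 8601 dates
        else:
            return False  # unknown type name
    except (TypeError, ValueError):
        return False
    return True


def find_type_mismatches(extracted, expected_types):
    """List the aliases whose sample value does not fit the target type."""
    return [
        alias
        for alias, type_name in expected_types.items()
        if alias in extracted and not matches_type(extracted[alias], type_name)
    ]


# Sample data as it might be returned by a test extraction.
sample = {
    "extract.OS:Title": "Quarterly report",
    "demo.Example:Date": "2024-13-01",  # invalid month
    "demo.Example:Count": "12",
}
print(find_type_mismatches(sample, EXPECTED_TYPES))
```

Any alias reported by find_type_mismatches points to a field whose data type or format should be reviewed before real documents are imported.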
To test extraction with the microservice:
- Open the microservices administration page using the URL: http://<service-admin-IP>:<port>
Default port: 7273
yuuvis® RAD extraction-service is listed under the name EXTRACTION.
- Click EXTRACTION and then click again to open the extraction area.
- Click Insights > Details.
The details page opens.
- Click the link to the Swagger UI.
- On the Extraction API page, select extraction-api from the list in the header.
- Click POST /extraction/api/xmp.
- Click Try it out!.
- Select a file from the file system.
- Start extraction by clicking Execute.
The data is extracted immediately and the result is shown.
The result has the following structure: "Alias": "Value".
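The same request can also be sent from a script instead of the Swagger UI. The following is a minimal sketch using only the Python standard library. The path /extraction/api/xmp is taken from the steps above; the base URL of the service, the multipart field name "file", and the assumption that the response body is a JSON object of alias/value pairs are not confirmed by this page and should be checked against the Swagger UI.

```python
import json
import mimetypes
import uuid
from pathlib import Path
from urllib import request

XMP_PATH = "/extraction/api/xmp"  # endpoint named in the Swagger UI steps


def build_extraction_url(base_url):
    """Join the service base URL (an assumption) with the endpoint path."""
    return base_url.rstrip("/") + XMP_PATH


def encode_multipart(field_name, filename, content):
    """Encode one file as a multipart/form-data body.

    Returns the body bytes and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    file_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: {file_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def extract_xmp(base_url, file_path, field_name="file"):
    """POST a file to the extraction endpoint and return the parsed result.

    The field name "file" and the JSON response format are assumptions.
    """
    path = Path(file_path)
    body, content_type = encode_multipart(field_name, path.name, path.read_bytes())
    req = request.Request(
        build_extraction_url(base_url),
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with request.urlopen(req) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as extract_xmp("http://<host>:<port>", "sample.pdf") would then return the alias/value pairs for further checks, where host and port are those of the extraction service in your installation.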
If you want to transfer the data extracted with yuuvis® RAD extraction-service, the schema must be modified in yuuvis® RAD designer:
- For every metadata field of an object type that is to receive extracted data, enter the corresponding yuuvis® RAD alias from the Metadata mapping tables in the field properties.
If the creation form contains a metadata field that accepts only unique values, only one document is created, from the first imported file, even if several files are imported. For this reason, metadata fields of object types intended for data transfer must not have the Unique property.
If extracted data exceeds the maximum length of metadata fields, it is truncated.
Data extracted from media and document files is processed so that the first suitable value is transferred to the metadata field.
The notation of yuuvis® RAD aliases is namespace.Name.
Example:
extract.OS:Title
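The alias notation and the truncation rule above can be checked on sample data with a short helper. This is an illustrative sketch, not part of the product: it assumes that the namespace itself contains no dot, and the max_lengths table is a hypothetical stand-in for the field lengths defined in the schema.

```python
from typing import Dict, List, Tuple


def split_alias(alias: str) -> Tuple[str, str]:
    """Split an alias in namespace.Name notation at the first dot."""
    namespace, separator, name = alias.partition(".")
    if not separator or not namespace or not name:
        raise ValueError(f"not in namespace.Name notation: {alias!r}")
    return namespace, name


def find_overlong_values(
    extracted: Dict[str, str], max_lengths: Dict[str, int]
) -> List[str]:
    """List the aliases whose value would be truncated on import.

    max_lengths is a hypothetical alias -> maximum field length table.
    """
    return [
        alias
        for alias, value in extracted.items()
        if alias in max_lengths and len(value) > max_lengths[alias]
    ]


print(split_alias("extract.OS:Title"))
print(find_overlong_values({"extract.OS:Title": "A" * 300}, {"extract.OS:Title": 255}))
```

Running find_overlong_values on the result of a test extraction shows in advance which fields would lose data, so that their maximum length can be adjusted in the schema.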
The following example shows the schema modification for a document type for e-mails.
yuuvis® RAD extraction-service maps the standard properties of e-mail files to yuuvis® RAD aliases (see Metadata Mapping).
Modifying the schema:
- Open the document type E-mail in yuuvis® RAD designer and go to the Fields area.
- Select the From field, add an alias, and enter the following information:
  - In the Name property, enter OS:MailFrom.
  - In the Namespace property, enter Extract.
- Set the field to read-only if necessary.
- Do the same for every other field of the document type.
- Save and enable the schema.
You have now completed the configuration process.
You can now switch to an e-mail program and transfer an e-mail to yuuvis® RAD. In yuuvis® RAD client, check that the metadata fields of the transferred e-mail are filled out correctly.
Plug-ins
yuuvis® RAD extraction-service can be supplemented with plug-ins that allow data to be extracted from configured file formats.
The documentation can be found in the developer area.