enaio® documentviewer
enaio® documentviewer displays a preview of the selected enaio® document in the clients and offers convenient features for viewing, navigating, and searching.
Thanks to its integrated components enaio® rendition and rendition cache, enaio® documentviewer also provides options for converting files into other file formats and for text recognition in image files.
enaio® documentviewer consists of the following components:
- Web application
The web application structures how document previews are shown in clients.
- Conversion component enaio® rendition
enaio® rendition generates renditions from documents (images, PDFs, TIFF, text, thumbnails, etc.).
- Rendition cache storage component
The rendition cache is a cache memory for managing generated renditions centrally. Only one preview is generated per document. If one and the same document is sent multiple times or a reference is created, enaio® documentviewer reuses the preview from the rendition cache.
- 'applet' service
Service for HTML/JS/CSS frontend components.
When you send a document to an internal recipient, enaio® documentviewer allows you to include a reference to the preview file in the e-mail and activate the preview in enaio® webclient. In addition, a thumbnail of the first page of an attached document can be inserted into the e-mail body.
Setup installs the enaio® documentviewer service with the local system account as its login, but the service must be run using an account with administrative rights.
Installation
enaio® documentviewer is installed via the component setup located in the directory \Backend\DocumentViewer.
Copy the directory onto the computer on which enaio® documentviewer is to be installed, since running the setup from a network can lead to errors.
The runtime environment (JDK and application server), the enaio® rendition conversion components, and Rendition Cache are also automatically installed.
The installed runtime environment should be used only for this core service, because the runtime environment is also updated when the core service is updated. If other enaio® or third-party components are run in the runtime environment, update errors may occur or the components may no longer work after an update.
Setup automatically registers the service with the respective home URL and service endpoint on enaio® server. These URLs can be opened and edited in enaio® enterprise-manager under Server properties > Category: Services > Documentviewer. The registry keys are transferred to the client registry file during enaio® client installation and can be read by other components.
If several servers are used, you will need to manually change the URL addresses for all servers in enaio® enterprise-manager after installation.
The workflow monitoring that is integrated in enaio® documentviewer runs automatically and stops all processes that have not been used for a defined amount of time. For that reason, the service must not be run by the default system user. After installation, open the Windows Services administrative tool and configure the service so that it runs under a local system administrator account.
After installation, access to the rendition cache must be restricted to prevent unauthorized external access. To do so, open the administration page of enaio® documentviewer and navigate to the 'enaio renditioncache' area. Select Server settings > IP filter to specify the servers from which the rendition cache can be accessed, i.e., the IP address of the enaio® server on which the rendition cache batch file (…\server\___ren.bat) is installed. This setting is specified as a regular expression. It is possible to specify more than one server.
In enaio® client, users can show or hide the document preview in the ribbon on the VIEW tab.
If users receive a message that a document preview cannot be displayed due to missing rights although they have document access rights, it may be necessary to activate the session cookies in the default Internet browser used on the workstations.
The viewing service is uninstalled using the component setup. It cannot be uninstalled via the Control Panel.
enaio® documentviewer logs to the <installdir>\logs directory using the default settings.
Environment
Rapid distribution of data can only be guaranteed if the computer on which enaio® documentviewer runs provides high-speed I/O communication (hard disk, memory, etc.).
Office documents are only shown in high quality with enaio® documentviewer if Microsoft Office has been installed on the computer.
The following steps are necessary on the enaio® documentviewer machine to integrate Microsoft Office:
- Create the following directory:
%windir%\syswow64\config\Systemprofile\Desktop
For Microsoft Office 32-bit: %windir%\system32\config\Systemprofile\Desktop
- Using the administrator service user account, set the path to temporary Internet files to an accessible directory via the Internet options:
Internet options > General > Settings > Temporary Internet files > Move folders
- Fonts must be installed with the 'for all users' option enabled so that they can be used by enaio® documentviewer.
- Start each of the Office applications once using the administrative service user’s account and confirm the dialogs.
- Additional steps for Office 365:
- Open component services (DCOMCNFG).
- Navigate to: Console Root > Component Services > Computers > My Computer > Properties > COM Security > Launch and Activation Permissions > Edit Default.
- Add the enaio® documentviewer technical user as a user with the Local Launch and Local Activation permissions.
- Restart the service.
enaio® documentviewer does not need Microsoft Office to work. Without it, however, the quality of the displayed Office documents is lower.
Ghostscript must be installed to generate TIFF files from the PDF format. Due to license requirements, Ghostscript must be installed separately.
Users can view previews only if they have the appropriate access rights. When document previews are retrieved, an entry is created in the history for both the technical user and the user logged in to enaio® client.
Note that access rights are not automatically assigned to new document types and, if required, you will need to also grant read rights for the technical user.
PDF Preview
Starting with version 9.0, enaio® documentviewer can be set to show previews in PDF format.
PDF preview is enabled in enaio® enterprise-manager under Settings > Server properties > Category: Services > DocumentViewer using the following Home URL:
http://<service-manager>/applet/pdfview/viewer.html?osid={OBJECTIDENT}&pagecount={pagecount}&sessionguid={sessionguid}&servername={servername}&serverport={serverport}&objecttype={objecttype}&q={searchterm}
This preview setting is also used in enaio® client, enaio® webclient, and enaio® contentviewer for the preview URL.
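To make the placeholder syntax of the Home URL more tangible, the following minimal Python sketch expands the template with sample values. It only illustrates how the `{...}` placeholders map to query parameters; the function name `expand_home_url` and all sample values are hypothetical, and how the clients actually fill the placeholders is not described here.

```python
from urllib.parse import quote

# Home URL template as given above; <service-manager> stays a placeholder host.
TEMPLATE = (
    "http://<service-manager>/applet/pdfview/viewer.html"
    "?osid={OBJECTIDENT}&pagecount={pagecount}&sessionguid={sessionguid}"
    "&servername={servername}&serverport={serverport}"
    "&objecttype={objecttype}&q={searchterm}"
)


def expand_home_url(template: str, values: dict) -> str:
    """Replace each {name} placeholder with its URL-encoded value."""
    url = template
    for name, value in values.items():
        url = url.replace("{" + name + "}", quote(str(value), safe=""))
    return url


# Hypothetical sample values, for illustration only.
sample = {
    "OBJECTIDENT": "4711",
    "pagecount": "12",
    "sessionguid": "0123456789ABCDEF",
    "servername": "localhost",
    "serverport": "4000",
    "objecttype": "262144",
    "searchterm": "annual report",
}

print(expand_home_url(TEMPLATE, sample))
```

URL-encoding the values keeps search terms that contain spaces or special characters from breaking the query string.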
The document preview can sometimes only show documents with restrictions, such as forms and PDFs with signatures.
Due to technical limitations of the integrated PDF.js viewer (which runs within Chromium), a document preview may not be displayed for files larger than 512 MB. If this occurs, the user receives a notification.
Configuration
Content Processing Bus
Core service coordination, particularly with regard to data flows when creating and modifying documents and index data, is controlled by a central content processing bus (CPB).
The CPB saves messages about changes made to documents, references, variants, and index data in queues designed specifically for this purpose. Queues are monitored and managed in enaio® enterprise-manager.
Batches integrated in the queues generate these messages and also delete them from the CPB once a core service has requested a message and executed the corresponding job successfully. If a core service stops working properly, it notifies the CPB to move the requested messages back to the queues so that they can be requested again.
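The request, delete-on-success, and move-back-on-failure cycle described above can be pictured with a small conceptual model. The following Python sketch is not the CPB's interface; the class `MessageQueue` and its method names are hypothetical, and the model ignores batches, persistence, and multiple queue instances.

```python
from collections import deque


class MessageQueue:
    """Toy model of a queue with request / confirm / give-back semantics."""

    def __init__(self, name):
        self.name = name
        self._pending = deque()  # messages waiting to be requested
        self._in_flight = {}     # messages handed out but not yet confirmed
        self._next_id = 0

    def put(self, payload):
        """Store a new message and return its ID."""
        self._next_id += 1
        self._pending.append((self._next_id, payload))
        return self._next_id

    def request(self):
        """Hand out the oldest pending message, or None if the queue is empty."""
        if not self._pending:
            return None
        msg_id, payload = self._pending.popleft()
        self._in_flight[msg_id] = payload
        return msg_id, payload

    def confirm(self, msg_id):
        """Job executed successfully: the message is deleted for good."""
        del self._in_flight[msg_id]

    def give_back(self, msg_id):
        """Job could not be executed: the message returns to the queue."""
        payload = self._in_flight.pop(msg_id)
        self._pending.append((msg_id, payload))

    def __len__(self):
        return len(self._pending)


queue = MessageQueue("RENDITION")
queue.put("document 1 changed")
msg_id, payload = queue.request()
queue.give_back(msg_id)   # the service failed, so the message is available again
msg_id, payload = queue.request()
queue.confirm(msg_id)     # the job succeeded, so the message is gone
```

The point of the pattern is that a message is only removed after the job has succeeded, so a failing core service does not cause jobs to be lost.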
After installing enaio®, the CPB is set up by default for use on an enaio® server. Queues and batches are provided with default values, and no settings need to be configured.
The CPB may only be deactivated when authorized by the support or consulting team.
The following queues are set up for the CPB:
RENDITION | Queue for generating renditions. It is read out by enaio® documentviewer. |
FULLTEXTIDX | Queue for full-text indexing of index data. It is read out by the component set up for full-text indexing, i.e., enaio® fulltext or MS SQL Server. |
FULLTEXTDOC | Queue for full-text indexing of documents. |
SLIDE | Queue for generating renditions in the SLIDE cache. It is read out by the SLIDE cache. Provided that the SLIDE cache has not been disabled in enaio® enterprise-manager, renditions generated by enaio® documentviewer will be saved here. |
PAGECOUNT | Queue for determining the number of pages in documents. It is read out by the document preview and the object information. |
Multiple instances of each queue can be configured, but it is not possible to create additional queue types for the CPB.
Queues and queue instances can be configured in enaio® enterprise-manager in the Server properties > Category: Services > Content Processing Bus area.
The following batches are set up by default for the queues:
ProcessSlideCPMessages | Processing of messages for generating renditions in the SLIDE cache |
ProcessPageCountCPMessages | Processing of messages for generating page numbers |
Batches can be configured in enaio® enterprise-manager in the Server properties > Category: Periodic jobs area. For all batches you can optionally specify whether and how many can be executed in parallel and in which server queue they should be executed.
In systems with multiple enaio® servers, batches for CPB queues can be transferred to one server for increased performance.
Aside from the basic configuration, core services do not have to be configured specifically for the CPB.
Full-text indexing of documents and index data as well as thumbnail generation can be activated for object types in enaio® editor.
The CPB offers extensive options to intelligently control the load during data processing. For example, batches can be distributed among several enaio® servers in order to improve performance. Alternatively, multiple core services of the same type, e.g., several enaio® documentviewer instances, can be used for job processing. A prerequisite for this is that the components involved must be configured accordingly and each core service instance must have an individual instance name. Our consulting team would be happy to assist you in configuring customized deployment scenarios.
To check the interaction between CPB and core services after installation or an update, you can monitor jobs in enaio® enterprise-manager in the Advanced administration > Monitoring > CP queues area. Here, you can verify whether messages could be processed or not.
Updating the Content Preview
enaio® documentviewer generates only one preview per document, which is stored in the rendition cache and can be reused. After the contents of a document have been changed, a new preview is created. However, the preview in enaio® client is not updated automatically; the updated preview is only displayed when users click the Update button in the content preview header.
Alternatively, you can set up the automatic generation and display of a document preview once document content has been modified and checked in.
To do so, add the following line to the [System] section of the …\etc\as.cfg file in the data directory: RELOADAFTERDOCCHANGE=1
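If the entry is to be added by script rather than by hand, a minimal Python sketch could look as follows. It assumes that as.cfg can be parsed as a plain INI file by `configparser`, which is an assumption and not confirmed here; the function name is hypothetical. Note that `configparser` drops comments when writing, so work on a copy of the file and compare the result before replacing the original.

```python
import configparser
import io


def enable_reload_after_doc_change(cfg_text: str) -> str:
    """Return the configuration text with RELOADAFTERDOCCHANGE=1 in [System]."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep the original upper/lower case of keys
    parser.read_string(cfg_text)
    if not parser.has_section("System"):
        parser.add_section("System")
    parser.set("System", "RELOADAFTERDOCCHANGE", "1")
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


# Hypothetical file content, for illustration only.
example = "[System]\nEXAMPLEKEY=value\n"
print(enable_reload_after_doc_change(example))
```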
Content Preview of Client-Encrypted Documents
Content previews of document files encrypted by enaio® client (see 'Client Encryption') can only be created if enaio® documentviewer decrypts the document files in order to generate the preview.
To do so, you will need to modify the following configuration file:
…\services\OS_DocumentViewer\webapps\osrenditioncache\WEB-INF\classes\config\config.properties
Change the value of the sec.decrypt.cc parameter from 'false' to 'true'.
If required, adjust the value of the sec.decrypt.cc.timespan parameter. The parameter specifies the timespan after which the decrypted document files and previews are deleted again. The preset value is 7200000 milliseconds, i.e., two hours.
In this case, enaio® documentviewer always decrypts all document files.
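The two parameters can also be changed by script. The following Python sketch is a minimal illustration that assumes simple `key=value` lines in config.properties; the helper names `set_property` and `hours_to_ms` are hypothetical. The conversion helper reflects that the preset 7200000 corresponds to two hours when read as milliseconds (7200000 / 1000 / 3600 = 2).

```python
def set_property(text: str, key: str, value: str) -> str:
    """Set key=value in .properties-style text, appending the key if missing."""
    lines = text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        stripped = line.lstrip()
        if stripped.startswith(("#", "!")):
            continue  # comment lines stay untouched
        name, sep, _ = stripped.partition("=")
        if sep and name.strip() == key:
            lines[i] = f"{key}={value}"
            replaced = True
    if not replaced:
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


def hours_to_ms(hours: float) -> int:
    """Convert hours to milliseconds."""
    return int(hours * 60 * 60 * 1000)


# Hypothetical excerpt of config.properties, for illustration only.
config = "sec.decrypt.cc=false\nsec.decrypt.cc.timespan=7200000\n"
config = set_property(config, "sec.decrypt.cc", "true")
config = set_property(config, "sec.decrypt.cc.timespan", str(hours_to_ms(1)))
print(config)
```

Matching on the complete key name matters here, because sec.decrypt.cc is a prefix of sec.decrypt.cc.timespan.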
Configuration on the Administration Page
enaio® documentviewer, enaio® rendition, and the rendition cache can be configured centrally on the administration page:
http://<gateway>/osdocumentviewer/admin
In environments with multiple installations of enaio® documentviewer, you can integrate these using the application-prod.yml configuration file in the …\services\OS_Gateway\apps\os_gateway\config\ directory:
- endpoint:
    name: osdocumentviewer_2
    url: 'http://<host:port>/osdocumentviewer'
The corresponding administration page is called up using:
http://<gateway>/osdocumentviewer_2/admin
enaio® gateway may have to be restarted.
A default user name is pre-set for logging into the administration page and should be changed during the initial configuration of enaio® documentviewer in the 'documentviewer' area.
You will also need to set up a technical user for enaio® documentviewer. The technical user needs read access to document types from which renditions are to be generated. Without read rights, renditions will not be created from documents of that type. By assigning read rights, it is also possible to control the document types from which no previews will be generated, for example, because it is not generally desirable or because an adequate rendition quality cannot be achieved for certain document types due to the format. Users can view previews only if they have the appropriate access rights. Note that access rights are not automatically assigned to new document types and, if required, you will need to subsequently grant read rights for the technical user.
What is more, the 'Server: Switch job context' system role is required for the technical user. Without this system role, the user is not authorized to display content in the content preview of enaio® client.
The administration page is divided into the following areas: enaio® documentviewer, enaio® rendition, and the rendition cache. When you click in an area, the options related to the respective component are displayed.
The settings specified on the administration page will be saved to the files config.properties and route.properties located in the <installdir>\webapps\osrenditioncache\WEB-INF\classes\config\ directory.
If changes were made, you will have to restart the core service.
The following settings can be changed on the administration page:
enaio® documentviewer
General Settings | |
Timeout (ms) | Specify after how long (in ms) preview generation is canceled. Default: 300000 |
Temp directory | Path to the conversion working directory. The temp directory contains temporary files for rendition generation and is automatically cleared on a regular basis by enaio® documentviewer. The directory should be on a local data carrier where rapid data access is possible. Default: path that was specified at installation, e.g., C:/enaio/services/OS_documentviewer/data/temp |
Administrator name | User to log in to the administration page of enaio® documentviewer. The only check performed is whether this user exists in the enaio® user administration; rights are not checked. |
enaio® rendition
Configuring Processing Routes | |
Use MS Office | If MS Office is used for document conversion, MS Office must be installed and customized on the computer that enaio® rendition runs on. Default: enabled |
Use Aspose | Aspose is an alternative that is used if MS Office is not available on the computer. Aspose is automatically installed with enaio® documentviewer. Default: enabled. Aspose creates PDFs in PDF/A-1 format by default. You can switch to PDF/A-2 with the following entry in the route.properties file located in the <installdir>\webapps\osrenditioncache\WEB-INF\classes\config\ directory: rendition-pdfa2=true |
Create PDF/A files | Specify whether files are generated in PDF/A format via the function Send e-mail > Content (PDF) in enaio® client. Default: disabled |
Use CPE | Specify whether enaio® documentviewer retrieves messages from the CPB. If the CPB has not been set up yet or is currently unavailable, it is advisable to deactivate this option. Otherwise, enaio® documentviewer creates an error log entry each time it tries to retrieve messages from the CPB, which may slow down performance. Default: enabled |
Cache options | |
Maximum cache size | Specify the maximum size of the cache directory. Specify the unit using MB, GB, or TB. Default: 500 GB. Recommendation for the minimum size: 100 GB |
Cache high-water mark (in percent) | Upper limit for automatic cache cleanup. When the high-water mark for the cache directory is reached, renditions are deleted until the low-water mark is reached. The oldest unmodified renditions are deleted first. Text renditions, i.e., OCR results, are never deleted. Default: 80 |
Cache low-water mark (in percent) | Lower limit for automatic cache cleanup. When the high-water mark for the cache directory is reached, renditions are deleted until the low-water mark is reached. The oldest unmodified renditions are deleted first. Text renditions, i.e., OCR results, are never deleted. Default: 60 |
Cache cleanup by age | Instead of a cache cleanup via the cache size, the high-water mark, and the low-water mark, you can set up a single cleanup based on age in the following file: <installdir>\webapps\osrenditioncache\WEB-INF\classes\config\config.properties. Change the value of cache.activeIndex to '1'. Adjust the value of cache.olderThanInSeconds. The preset value is 15811200 seconds (183 days). Restart the service. Text renditions, i.e., OCR results, are not deleted here either. |
Disable cache cleanup for an instance | If several enaio® documentviewer instances use the same cache and cache cleanup is to be handled by one instance, the cache cleanup function can be disabled for the other enaio® documentviewer instances: cache.activeIndex=2 |
Additional options | |
Update MS Office form fields | Specify whether form fields in MS Office documents are updated during conversion. Using the custom settings, you can define which form fields are updated. You can choose from date fields, fields with formulas and calculations, fields with page numbers, and fields with file names and file paths. If you activate the update with Yes, only fields with page numbers and fields containing formulas and calculations are updated. Default: no |
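The interplay of the maximum cache size, the high-water mark, and the low-water mark described in the cache options can be sketched as follows. This Python sketch is only a model of the documented rule (start at the high-water mark, delete the oldest renditions first, skip text renditions, stop at the low-water mark); the class `Rendition` and the function `select_for_cleanup` are hypothetical and do not represent the actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Rendition:
    name: str
    size: int             # size in bytes (any consistent unit works)
    last_modified: float  # timestamp; smaller means older
    is_text: bool = False  # text renditions (OCR results) are never deleted


def select_for_cleanup(renditions, max_size, high_pct=80, low_pct=60):
    """Return the renditions to delete, oldest first, or [] if no cleanup is due."""
    used = sum(r.size for r in renditions)
    if used < max_size * high_pct / 100:
        return []  # high-water mark not reached yet
    target = max_size * low_pct / 100
    victims = []
    for rendition in sorted(renditions, key=lambda r: r.last_modified):
        if used <= target:
            break  # low-water mark reached
        if rendition.is_text:
            continue
        victims.append(rendition)
        used -= rendition.size
    return victims


# Hypothetical cache content with a maximum size of 100 units.
cache = [
    Rendition("ocr-result", 10, 0.0, is_text=True),
    Rendition("preview-a", 30, 1.0),
    Rendition("preview-b", 30, 2.0),
    Rendition("preview-c", 20, 3.0),
]
print([r.name for r in select_for_cleanup(cache, max_size=100)])
```

Using two thresholds instead of one avoids a cleanup run for every newly stored rendition once the cache is nearly full.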
enaio renditioncache
Server settings | |
Server connection | Name or IP address of the server and its port are followed by the addressing probability. Entries must be separated by a colon. Default: server and port that were specified at installation, e.g., localhost:4000:100. Multiple servers (separated by '#') can be specified. |
Instance name | Name of the instance under which the documentviewer instance is executed. The instance name must be unique to avoid conflicts with other core services. Default: Rendition |
Name of the technical user | Server login name of the technical user. Preview generation is performed entirely with the technical user account. The technical user must therefore be granted read permissions for all document types for which a preview can be created. For display in enaio® documentviewer, he or she will also need the 'Server: Switch job context' system role. |
Password of the technical user | Server login password of the technical user. Preview generation is performed entirely with the technical user account. The technical user must therefore be granted read permissions for all document types for which a preview can be created. For display in enaio® documentviewer, he or she will also need the 'Server: Switch job context' system role. Default: optimal |
Object history entry | Each preview retrieval using enaio® documentviewer is logged in the object history. Specify the text for this entry here. Default: document was displayed for preview. |
IP filters | Access to the rendition cache must be restricted via IP filters to avoid unauthorized external access. The default setting (*) allows any access. |
OCR engine | Define whether text recognition with FineReader is enabled. |
Parallel OCR | Specify for how many documents text recognition is run simultaneously. |
Working directories | |
Cache directory | Path to the cache directory of the rendition cache. It contains pre-generated preview documents and should be located on a data carrier with enough free space. Detailed information on what size data carrier is needed can be found in the document titled 'System Requirements'. Default: path that was specified at installation, e.g., C:/OSECM/Services/enaio documentviewer/data/cache |
Database directory | Path to the database directory. It contains databases used in generating previews and should be located on a local data carrier where fast data access is possible. Default: path that was specified at installation, e.g., C:/OSECM/Services/enaio documentviewer/data/db |
Job directory | Path to the internal job directory. It contains jobs used for generating previews and should be located on a local data carrier where fast data access is possible. Default: path that was specified at installation, e.g., C:/OSECM/Services/enaio documentviewer/data/jobs |
Session configuration | |
User session timeout (ms) | Specify after how much time (in ms) an inactive user session is closed. Default: 1200000 |
Check user session activity | A separate job checks whether the current user session is still active. When this option is activated, enaio® documentviewer is better able to respond to network disruptions; however, the volume of network traffic also increases due to the higher levels of communication with the server. Default: enabled |
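The format of the 'Server connection' value from the server settings (server, port, and addressing probability separated by colons, multiple servers separated by '#') can be checked with a short parser. The following Python sketch is an illustration only; the names `ServerEntry` and `parse_server_connection` as well as the host names in the second example are hypothetical, and IPv6 addresses are not handled.

```python
from typing import NamedTuple


class ServerEntry(NamedTuple):
    host: str
    port: int
    probability: int  # addressing probability, e.g., 100


def parse_server_connection(value: str) -> list:
    """Split 'host:port:probability#host:port:probability' into entries."""
    entries = []
    for part in value.split("#"):
        host, port, probability = part.strip().split(":")
        entries.append(ServerEntry(host, int(port), int(probability)))
    return entries


print(parse_server_connection("localhost:4000:100"))
# Hypothetical two-server setup with an even split.
print(parse_server_connection("server1:4000:50#server2:4000:50"))
```

An entry with a missing or additional field raises a ValueError, which makes malformed values visible before they are entered on the administration page.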
In enaio® enterprise-manager, LoginPipe exceptions can be configured in the Server properties > Category: General > Login area.
You will need to specify the user name as well as the IP address of the computer on which enaio® documentviewer is run. The user must also be assigned the 'Server: Switch job context' system role.
enaio® documentviewer uses the SHA-256 hash over the content for structured storage inside the cache. The algorithm is the same as the one used by the server. Therefore the server job 'std.GetDocumentDigest' is used to determine the hash. If the server job returns another hash value, such as MD5, enaio® documentviewer calculates the hash value itself. enaio® documentviewer may need to modify the hash value to ensure that it is unique. This occurs both with annotated documents – the server ignores them – and with archived documents. In the case of archived documents, it is possible that several documents are combined in one container and the job 'std.GetDocumentDigest' returns the hash value of the container for all documents in this container.
If enaio® documentviewer calculates a modified hash value, the existing renditions may be generated again and the cache enlarged as a result. If you can guarantee that the hash values are unique using the 'GetDigestById' job for the archiving of documents, then the archived.mask.digest = false parameter in the config.properties configuration file of the rendition cache can be entered to prevent enaio® documentviewer from modifying the hash value, thereby generating existing renditions again.
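Why a content hash allows renditions to be reused can be shown with a few lines of Python. The sketch computes the SHA-256 hash of a document's content and derives a storage path from it; the two-level directory layout and the function names are assumptions made for illustration, since the actual structure of the rendition cache is not described here.

```python
import hashlib
from pathlib import PurePosixPath


def content_digest(data: bytes) -> str:
    """Return the SHA-256 hash of the content as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def cache_path(cache_root: str, digest: str) -> PurePosixPath:
    """Derive a storage location from the digest (hypothetical layout)."""
    return PurePosixPath(cache_root) / digest[:2] / digest[2:4] / digest


original = b"content of a document"
reference = b"content of a document"   # same content, e.g., a reference
changed = b"content of a document, edited"

# Identical content yields the same digest, so one rendition can serve both.
print(content_digest(original) == content_digest(reference))
# Changed content yields a different digest and therefore a new rendition.
print(content_digest(original) == content_digest(changed))
print(cache_path("/cache", content_digest(original)))
```

This also illustrates why a modified hash value leads to renditions being generated again: a different key points to a different cache entry, even if the content is unchanged.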
IP Filters
Access is required for enaio® server, enaio® services, enaio® fulltext, enaio® appconnector, and, if necessary, other computers in the services infrastructure. '127.0.0.1' must also be entered for communication between services.
Addresses are specified as regular expressions. In a list of addresses, IP addresses must always be enclosed in parentheses. Addresses are separated by the pipe character '|'.
Permitted access |
Sample configuration |
By all IP addresses |
.* |
By multiple addresses |
(10.10.10.10)|(10.10.10.11)| ... (10.10.10.1x) |
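Filter expressions of this kind can be tried out before they are entered. The following Python sketch builds a list expression in the format shown above and tests addresses against it. It assumes that the filter must match the complete address, which is an assumption about the matching behavior; the function names are hypothetical. Unlike the sample configuration, the sketch escapes the dots, because an unescaped '.' in a regular expression matches any character.

```python
import re


def build_ip_filter(addresses: list) -> str:
    """Join addresses as '(addr)|(addr)|...' with regex metacharacters escaped."""
    return "|".join(f"({re.escape(address)})" for address in addresses)


def is_allowed(pattern: str, address: str) -> bool:
    """Check an address against the filter, assuming full-match semantics."""
    return re.fullmatch(pattern, address) is not None


pattern = build_ip_filter(["127.0.0.1", "10.10.10.10", "10.10.10.11"])
print(pattern)
print(is_allowed(pattern, "10.10.10.11"))
print(is_allowed(pattern, "10.10.10.12"))

# With unescaped dots, unintended strings match as well.
print(is_allowed("(10.10.10.10)", "10x10x10x10"))
```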
enaio® rendition and Rendition Cache
enaio® rendition provides a rendition service for converting files into different file formats and text recognition in image files. The conversion process for numerous source and target formats can be controlled in great detail and customized to meet individual requirements.
The rendition cache is the storage component of enaio® documentviewer which centrally manages the generated renditions and contains the conversion logic.
When installing enaio® documentviewer, the batch file ___ren.bat is copied to the server directory.
This file controls communication with the rendition cache via the REST client curl.exe, which is also installed, and ensures that all conversions to the PDF or TIF format that have not been configured yet using a batch file are forwarded to the rendition cache for processing.
If you manage only single-page image documents in PDF format with the image modules, you can use enaio® rendition to convert them by disabling the internal image conversion in the Server properties > Category: General > Conversion area in enaio® enterprise-manager.
In the case of multi-page image documents that are managed in PDF format, enaio® rendition would only convert the first page of each document; all other pages would be lost. For this reason, enaio® rendition is generally not suited for converting such documents.
enaio® rendition enables W-Documents in enaio® client to be converted to PDF format under Send e-mail > Content (PDF) in the menu and then sent. The converted files comply with the PDF/A-1a standard, provided it was configured accordingly on the enaio® documentviewer administration page.
enaio® rendition automatically provides renditions with the object ID of the document belonging to the rendition. In this way, a rendition can be read directly from the rendition cache using the object ID when it is sent using the Send e-mail > Content (PDF) function in enaio® client, without the document first having to be transferred from enaio® server to enaio® rendition. This speeds up conversion and reduces the amount of data to be transmitted across the network.
You can configure the setting in enaio® enterprise-manager in the Server properties > Category: General > Conversion > Call renditions using object ID area. Rendition calls via the object ID are activated by default.
The following table gives an overview of which source formats can be converted to which target formats with enaio® rendition.
Source format | Extension | Target format: Preview image | Target format: PDF | Target format: TIFF | Target format: PDF/A |
---|---|---|---|---|---|
Bitmap graphic | bmp | x | x | x | x |
Comma-separated values | csv | x | x | x | x |
Device-independent bitmap graphic | dib | x | x | x | x |
Word document | doc | x | x | x | x |
MS Word document with macros | docm | x | x | x | x |
MS Word XML document | docx | x | x | x | x |
MS Word document template | dot | x | x | x | x |
MS Word XML document template | dotx | x | x | x | x |
AutoCAD drawing | dwg | x¹ | x⁴ | x¹ | x⁴ |
Drawing interchange file format | dxf | x¹ | x¹ | x¹ | x¹ |
Extended (enhanced) Windows metafile format | emf | x² | x² | x² | x² |
Outlook e-mail | eml | x² | x² | x² | x² |
Encapsulated portable document format | epdf | x² | x² | x² | x² |
EclipsePackager invoice | epi | x | x | x | x |
Encapsulated PostScript | eps⁵ | x | x | x | x |
Encapsulated PostScript | epsf⁵ | x | x | x | x |
Encapsulated PostScript | epsi⁵ | x | x | x | x |
High Efficiency Image Container | heic⁶ | x | x | x | x |
OpenEXR bitmap | exr | x | x | x | x |
Graphics interchange format | gif | x | x | x | x |
Windows icons | ico | x | x | x | x |
Joint photographic experts group | jpg | x | x | x | x |
MS Project | mpp | x¹ | x¹ | x¹ | x¹ |
Multipage TIFF bitmap | mpt | x¹ | x¹ | x¹ | x¹ |
Microsoft Exchange mail document | msg | x | x | x | x |
OpenDocument (Version 2) graphics document | odg | x | x | x | x |
OpenDocument (Version 2) presentation | odp | x | x | x | x |
OpenDocument (Version 2) spreadsheet | ods | x | x | x | x |
OpenDocument (Version 2) text document | odt | x | x | x | x |
Portable Bitmap Graphic | pbm | x | x | x | x |
Picture Exchange | pcx | x | x | x | x |
Portable Document Format | pdf | x | x | x | x |
Portable Network Graphics | png | x | x | x | x |
Portable Anymap | pnm | x | x | x | x |
PowerPoint templates | pot | x | x | x | x |
MS Presentation template with macros | potm | x | x | x | x |
MS Presentation template | potx | x | x | x | x |
MS PowerPoint slideshow | pps | x | x | x | x |
MS PowerPoint slideshow with macros | ppsm | x | x | x | x |
MS PowerPoint XML slideshow | ppsx | x | x | x | x |
MS PowerPoint presentation | ppt | x | x | x | x |
MS PowerPoint presentation with macros | pptm | x | x | x | x |
MS PowerPoint XML presentation | pptx | x | x | x | x |
PostScript | ps⁵ | x | x | x | x |
PostScript Level 2 | ps2⁵ | x | x | x | x |
PostScript Level 3 | ps3⁵ | x | x | x | x |
Pyramid-encoded TIFF | ptif | x | x | x | x |
Rich Text Format | rtf | x | x | x | x |
Scalable vector graphics | svg | x¹ | x¹ | x¹ | x |
Compressed scalable vector graphics | svgz | x¹ | x¹ | x¹ | x |
OpenOffice spreadsheet | sxc | x | x | x | x |
OpenOffice presentation | sxi | x | x | x | x |
OpenOffice text | sxw | x | x | x | x |
Wireless bitmap file format | wbmp | x | x | x | x |
Windows metafile | wmf | x | x | x | x |
MS Works Word Processor | wps | x | x | x | x |
X bitmap graphic | xbm | x | x | x | x |
Gimp eXperimental Computing Facility | xcf | x | x | x | x |
MS Excel binary workbook with macros | xlsb | x | x | x | x |
MS Excel workbook with macros | xlsm | x | x | x | x |
MS Excel workbook | xlsx | x | x | x | x |
MS Excel template with macros | xltm | x | x | x | x |
MS Excel template | xltx | x | x | x | x |
Extensible Markup Language | xml | x | x | x | x |
1 Requires Microsoft Office to be installed on the enaio® rendition computer.
2 Only the body is converted for these file formats.
4 In most cases, a converter, such as Any DWG, is necessary. Integration takes place in the project.
5 These documents can only be displayed in reduced quality. Additional converters are required to generate higher quality renditions.
6 Text recognition for this format is not supported.
The list of supported formats contains the formats that can usually be rendered using the converters included in the scope of supply. Because Microsoft Office documents may contain all kinds of embedded objects, rendering may not produce 100 percent accurate results.
In general, the Aspose converter can also convert formats that are not listed in the table above to PDF or PDF/A. Aspose may encounter display, formatting, or conversion errors when converting non-XML based Office formats in particular.
In addition to the listed formats, other formats can be rendered for specific projects with support from the professional services team at OPTIMAL SYSTEMS, as well as additional converters. If other converters are used, these must be able to be called from the command line and generate a PDF as the output format. Converters may not open dialogs that require input.
Converter for LibreOffice/OpenOffice
To convert LibreOffice/OpenOffice documents, you will need to have the relevant applications installed and the following files:
libreoffice2pdf.js and libreoffice2pdf.xml
The files must be copied to the following directory:
<installdir>\renditionplus\bin\custom
The libreoffice2pdf.js file must be modified:
- executor.setWorkingDirectory([WorkingDirectory])
Path to program directory
Example: executor.setWorkingDirectory("E:/LibreOffice/LibreOfficePortablePrevious/App/libreoffice/program")
- executor.execute([soffice.exe], ...
Path to program directory with application
Example: executor.execute("E:/LibreOffice/LibreOfficePortablePrevious/App/libreoffice/program/soffice.exe", ...
The programs use presets for the conversion. To change the presets, start the program and select the Export as PDF feature. The settings are displayed, and any changes are saved and used for subsequent conversions.
enaio® rendition and Quicklooks
If the Document type without slide creation property is disabled for document types in enaio® editor, enaio® rendition automatically creates Quicklooks. If the property is enabled, no Quicklooks are created.
Hard Disk Monitoring
Monitoring of the cache and temp directories is enabled by default. If the amount of free space falls below the threshold of 30 MB, enaio® documentviewer operates in read-only mode only: no further CPB jobs are processed, no new OCR jobs are delivered, no new file jobs are processed, and no internal worker jobs are processed further.
The default settings can be adjusted via the configuration file config.properties from the …\webapps\osrenditioncache\WEB-INF\classes\config\ directory:
Parameters:
system.checkDiskSpace=true
system.minFreeCacheSpaceInMB=30
system.minFreeTempSpaceInMB=30
The read-only mode is stopped automatically once there is more space available.
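The threshold check described above can be illustrated with a short sketch. It is not the product's implementation; the function names are hypothetical, the defaults mirror the three parameters listed above, and 1 MB is assumed to be 1024 × 1024 bytes.

```python
import shutil

MB = 1024 * 1024  # assumption: the thresholds are binary megabytes


def free_space_mb(path):
    """Return the free space of the drive that holds path, in MB."""
    return shutil.disk_usage(path).free / MB


def should_be_read_only(cache_free_mb, temp_free_mb,
                        check_disk_space=True,
                        min_free_cache_mb=30,
                        min_free_temp_mb=30):
    """Mirror system.checkDiskSpace and the two minFree...SpaceInMB values."""
    if not check_disk_space:
        return False
    # Read-only as soon as either directory falls below its threshold.
    return (cache_free_mb < min_free_cache_mb
            or temp_free_mb < min_free_temp_mb)
```

Calling should_be_read_only(free_space_mb(cache_dir), free_space_mb(temp_dir)) with the actual cache and temp directories shows whether the current free space would trigger the read-only mode under these assumptions.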
Page Count
As of version 10.0, the page count is only created for the specified file formats. These are specified using the parameter system.pageCountSupportedFor of the config.properties configuration file from the …\webapps\osrenditioncache\WEB-INF\classes\config\ directory.
Default setting:
system.pageCountSupportedFor=application/vnd.ms, application/vnd.openxmlformats, application/ms, text/rtf, application/vnd.oasis.opendocument, application/pdf, image/
The setting can be changed. The content type does not need to be specified in its entirety.
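Because the content type does not need to be specified in its entirety, the entries presumably act as prefixes. The following sketch illustrates this reading; the prefix matching and the function names are assumptions and are not confirmed by the product documentation.

```python
DEFAULT_PAGE_COUNT_TYPES = (
    "application/vnd.ms, application/vnd.openxmlformats, application/ms, "
    "text/rtf, application/vnd.oasis.opendocument, application/pdf, image/"
)


def parse_prefixes(value):
    """Split the comma-separated parameter value into trimmed entries."""
    return [entry.strip().lower() for entry in value.split(",") if entry.strip()]


def page_count_supported(content_type, prefixes):
    """Assumption: an entry matches if the content type starts with it."""
    normalized = content_type.strip().lower()
    return any(normalized.startswith(prefix) for prefix in prefixes)
```

Under this reading, the entry image/ covers image/png and image/tiff alike, while text/plain is not covered by the default setting.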
Determining the Cache Size
Determination of the cache size is preset to 'heuristic'. In this mode, the size is calculated based on average values. To switch to precise determination, set the following parameter in the config.properties configuration file located in the <installdir>\webapps\osrenditioncache\WEB-INF\classes\config\ directory:
cacheCount.activeIndex=1
The value 0 reverts to 'heuristic'.
Using the value cacheCount.activeIndex=2, enaio® documentviewer can be set so that it uses the size and free storage space of the underlying hard drive to calculate the cache size instead of using the cache itself. This enables it to be more responsive, especially for large and rapidly changing caches. It should be noted that other activities that change the size of the hard drive's utilized storage space may influence or distort the results. Therefore, when using this option, it is recommended to transfer the cache to a separate hard drive or partition.
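The difference between precise determination and drive-based determination can be sketched as follows. This is an illustration only: the function names are hypothetical, and the way enaio® documentviewer actually computes the values is not documented here.

```python
import os
import shutil


def precise_cache_size(cache_dir):
    """Sum the sizes of all files below cache_dir (slow for large caches)."""
    total = 0
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def drive_based_cache_size(cache_dir):
    """Derive the size from the drive: total size minus free space.

    Everything else stored on the same drive is counted as well, which is
    why a separate drive or partition is recommended for this mode.
    """
    usage = shutil.disk_usage(cache_dir)
    return usage.total - usage.free
```

The drive-based value needs a single system call regardless of the number of cached files, which is consistent with the responsiveness described above.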
Determining the Content Type
When the content type is determined, the extensionmapper.properties configuration file located in the \renditionplus\bin\custom\ directory has the highest priority. It maps file extensions to content types. This makes it possible to define user-specific content types based on file extensions and to provide targeted custom converters for these content types.
Examples:
tff=image/tiff
adf=custom/compound
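The mapping can be pictured as a simple lookup from file extension to content type. The following sketch parses entries in the format shown above; the function names and the case-insensitive handling of extensions are assumptions for illustration.

```python
import os


def parse_extension_map(text):
    """Parse lines of the form extension=content/type into a dictionary."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blank lines, comments, and malformed lines
        extension, content_type = line.split("=", 1)
        mapping[extension.strip().lower()] = content_type.strip()
    return mapping


def content_type_for(filename, mapping, fallback=None):
    """Return the mapped content type, or fallback if none is defined."""
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    return mapping.get(extension, fallback)
```

With the two example entries, a file named bundle.adf resolves to custom/compound, and a file with an unmapped extension falls through to whatever detection would apply next.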
Renditions from External Sources
enaio® documentviewer provides REST endpoints that external rendition creators can use to file their renditions in the Documentviewer cache, thereby reducing the load on the system:
POST/GET: /osrenditioncache/app/api/dms/{id}/contents/renditions/{type}
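The following sketch shows how an external rendition creator might address this endpoint. Only the path template is taken from this documentation. The base URL, the object ID, the rendition type value, the request body format, and the content type header are hypothetical, and authentication is omitted entirely. The request is only built here, not sent.

```python
import urllib.request
from urllib.parse import quote

PATH_TEMPLATE = "/osrenditioncache/app/api/dms/{id}/contents/renditions/{type}"


def rendition_url(base_url, object_id, rendition_type):
    """Fill the documented path template with URL-encoded values."""
    path = PATH_TEMPLATE.format(
        id=quote(str(object_id), safe=""),
        type=quote(rendition_type, safe=""),
    )
    return base_url.rstrip("/") + path


def build_upload_request(base_url, object_id, rendition_type, payload):
    """Build (but do not send) a POST request carrying the rendition.

    Assumption: the rendition is sent as the raw request body.
    """
    return urllib.request.Request(
        rendition_url(base_url, object_id, rendition_type),
        data=payload,
        method="POST",
        headers={"Content-Type": "application/octet-stream"},
    )
```

A GET request to the same URL would retrieve a rendition of the given type accordingly.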
Minimum Required Text Length
The minimum required text length that is used to detect whether a text extract is valid and sent to the OCR can be changed via the config.properties configuration file located in the <installdir>\webapps\osrenditioncache\WEB-INF\classes\config\ directory.
rendition.textExtraction.minTextLength=5
File Formats for OCR
The file formats that are to be sent to the OCR can be specified via the config.properties configuration file located in the <installdir>\webapps\osrenditioncache\WEB-INF\classes\config\ directory. Default:
rendition.ocrSelectionPredicate=image/tif,application/pdf
Example: rendition.ocrSelectionPredicate=image/tif,application/pdf,image/png
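The two parameters rendition.textExtraction.minTextLength and rendition.ocrSelectionPredicate can be read as a two-step decision. The sketch below shows one plausible interpretation, not the documented behavior: it assumes that the predicate entries are matched as prefixes of the content type (so that image/tif also covers image/tiff) and that a text extract shorter than the minimum length counts as invalid and leads to OCR.

```python
def parse_list(value):
    """Split a comma-separated parameter value into trimmed entries."""
    return [entry.strip().lower() for entry in value.split(",") if entry.strip()]


def needs_ocr(content_type, extracted_text,
              ocr_selection_predicate="image/tif,application/pdf",
              min_text_length=5):
    """Decide whether a file would be sent to the OCR under the assumptions above."""
    normalized = content_type.strip().lower()
    selected = any(normalized.startswith(entry)
                   for entry in parse_list(ocr_selection_predicate))
    if not selected:
        return False  # format is not configured for OCR at all
    # Assumption: too little extracted text means the extract is invalid.
    return len(extracted_text.strip()) < min_text_length
```

With the example setting that adds image/png, a PNG file without extractable text would be selected for OCR under this interpretation, whereas with the default setting it would not.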
Maximum File Size
The maximum size of files processed can be specified via the config.properties configuration file located in the <installdir>\webapps\osrenditioncache\WEB-INF\classes\config\ directory. Default:
system.maxFileSizeForObjectInMB=1024
If an object has multiple content files, then this value applies to the sum of the files.
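The rule that the limit applies to the sum of all content files of an object can be illustrated as follows. The function name is hypothetical, and the sketch assumes that 1 MB equals 1024 × 1024 bytes and that an object exactly at the limit is still processed.

```python
MB = 1024 * 1024  # assumption: binary megabytes


def within_size_limit(file_sizes_in_bytes, max_file_size_for_object_in_mb=1024):
    """Compare the total size of all content files with the limit."""
    return sum(file_sizes_in_bytes) <= max_file_size_for_object_in_mb * MB
```

For example, an object with two content files of 600 MB each exceeds the default limit of 1024 MB, although each file on its own would be below it.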