inbounddoc


obj=new("inbounddoc",sourceid$,doctype$,docid$)

 

The inbounddoc object provides methods to manage a single document in an inbound library.  It exposes identification values and some Image Manager-specific values as properties rather than methods, and offers simpler syntax for some methods that are mirrored in the inbound object.

 

Properties

 

exists is set to true (1) if the document specified in the new() instantiation function exists.

ibsrc$, ibdoctype$, ibdocid$ are three properties that identify the document in the inbound source library.  They are given these property names to distinguish them from the library$, doctype$, and docid$ properties that are used when the document is transferred to a document archive library.

invaliditems$ is a list of field and identification names and associated error messages.  It should not be maintained directly, but rather through the initinvaliditems() and iteminvalid() methods.  The list is in tab-separated-values format.

jobid$ is the job id of the last job run against this document.  Additional details about the job can be accessed in the browser interface.

jobname$ is the name of the job assigned to this document, through Image Manager operations or directly via assignment of this property. Note that assigning the property does not run the job.

user$ is the UnForm user id assigned to this document.  This should be an Image Manager user.  Documents assigned to a user show in the Image Manager pending list for that user.  If user$ is null (""), the document is available for self-assignment.

The following additional properties can be assigned as identification and standard archive properties.

library$
doctype$
docid$
subid$
subtitle$
title$
entityid$
date$ (should be in yyyymmdd format)
categories$ (pipe-delimited segments, semi-colon delimited category indexes)
keywords$ (semi-colon delimited words)
links$ (semi-colon delimited urls or pipe-separated library|doctype|docid[|subid] strings)
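
As a minimal sketch of how these properties might be set, the following code opens a document and assigns a few archive properties if the document exists.  The source id, document type, document id, and property values are hypothetical, and the sketch assumes it runs in a context where the inbounddoc class is available.

rem Hypothetical identifiers and values, for illustration only
doc=new("inbounddoc","scans","invoice","000123")
if doc'exists then doc'library$="ap"; doc'doctype$="invoice"; doc'docid$="INV-1001"; doc'date$="20240131"
drop object doc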

 

 

 

Methods

 

assignto([userid$]) assigns the document to userid$, or de-assigns it if no userid$ is supplied.  De-assigned documents are in the pool of available documents that Image Manager users can self-assign.
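
For example, assuming doc refers to an inbounddoc object and "jsmith" is a hypothetical Image Manager user id, assignment and de-assignment might look like this:

rem Assign the document to a specific user
doc'assignto("jsmith")
rem Return the document to the self-assignment pool
doc'assignto()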

bcd$(x,y,w,h[,page|pages$][,symbology$]) returns a barcode value found on the specified page(s) at the coordinates supplied (in inches).  If no page is supplied, 1 is assumed.  If a string of pages is supplied, such as "1-5", or "1,3,5", or "first-{last-1}", multiple values separated by [page] headers are returned.  Optionally specify a symbology that the barcode must match, such as code39, code128, or qrcode.
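
A sketch of reading a barcode into a custom field, assuming doc refers to an inbounddoc object.  The region coordinates and the field name "ponumber" are hypothetical, and the sketch assumes an empty string is returned when no barcode is found.

rem Look for a Code 39 barcode in a 3 x 1 inch region on page 1
po$=doc'bcd$(0.5,0.5,3,1,1,"code39")
if po$<>"" then doc'putfield("ponumber",po$)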

delfield(name$) removes a specific custom data field from the document.

getfield$(name$) returns the custom field data named name$.

getfield(name$,value$) fills value$ with the custom field name$ data.

getfields$() returns a tab-separated values list of custom field names and values.

getmeta$(name$) returns the metadata value of name$.  Metadata items are read-only values created at the time a document is imported from a source, such as subject and from addresses of email sources, or file names of directory sources.  Each name is prefixed with "@".

getmeta(name$,value$) fills value$ with the metadata value for name$, such as @subject, or @filename.

getmetas$() returns a tab-separated values list of metadata names and values.
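
As an illustration, assuming doc refers to an inbounddoc object imported from an email source, the subject could be copied into the title property when one is present:

rem Use the email subject as the document title, if available
subject$=doc'getmeta$("@subject")
if subject$<>"" then doc'title$=subject$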

getpages() returns the number of image pages in the document.

gridzone$(coldefs$[,pages$|page]) returns a tab-delimited-values list of OCR word data from the coldefs$ specification.  If no page is supplied, 1 is assumed.  If a string of pages is supplied, such as "1-5", or "1,3,5", or "first-{last-1}", columns contain [page] headers.  The coldefs$ variable must be a tab-separated-values list in which each column is defined as a name, position, pages, and filter.  Position is a comma-separated list of left, top, width, height in inches.  Column filters are applied.

initinvaliditems() initializes all validation error messages for the document.

iteminvalid(item$,errmsg$) sets a validation error message on a field or identification name, using the item naming syntax of "field.name" or "ident.name".
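
A minimal sketch of custom validation using these two methods, assuming doc refers to an inbounddoc object.  The field name "invoiceno" and the message text are hypothetical.

rem Clear existing validation messages, then flag a missing field
doc'initinvaliditems()
if doc'getfield$("invoiceno")="" then doc'iteminvalid("field.invoiceno","Invoice number is required")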

ocr$(x,y,w,h[,page|pages$][,trim]) returns the OCR word values found on the specified page(s) at the coordinates supplied (in inches).  If no page is supplied, 1 is assumed.  If a string of pages is supplied, such as "1-5", or "1,3,5", or "first-{last-1}", multiple values separated by [page] headers are returned.

 

This method is used by auto-generated code for OCR zone regions in Image Manager jobs, but can be useful outside of automatic zone assignments.  For example, to convert text on page 1 to keywords, custom script code like this could be used:

 

doc'keywords$=texttokeywords(doc'ocr$(0,0,8.5,11,"1"))

 

ocrconfidence(x,y,w,h[,page|pages$]) will return the lowest OCR confidence value for words in the region specified, a value from 1 to 100.  If no values are available, this returns 100.  Confidence values are returned by an UnForm OCR service.
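
For instance, assuming doc refers to an inbounddoc object, a script might flag a field when the words in a region were read with low confidence.  The threshold of 80, the region, and the field name "total" are hypothetical.

rem Flag the field if any word in the region has low OCR confidence
conf=doc'ocrconfidence(5,9,3,1,1)
if conf<80 then doc'iteminvalid("field.total","Low OCR confidence: "+str(conf))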

putfield(name$,value$) updates a custom field name$ with the supplied value$.

removepage(page) removes the specified page number from the document, shifting later pages up.  Returns 1 if successful.

reversepages() reverses the page order of the document.

rotate(degrees) rotates all pages and re-processes them for text and page extraction.

splitdoc$() splits all images of a document into new single-page documents.  It returns a tab-separated-values list of the added doc types and doc ids (note no trailing linefeed character).  All document data and field properties are duplicated.

splitdoc$(atpage) splits the document into two, with pages starting with atpage removed from the first document and added to the second.  It returns the added doc type and doc id, separated by a tab character.  All document data and field properties are duplicated.

splitdoc$(zonename$[,jobname$]) splits the document at pages where the specified zone value changes.  If no jobname$ is specified, the job currently assigned to the document is used.  Both OCR and barcode zone types are supported.  If the zone returns a value, and that value differs from the previous page, the document is split at that page.  The function returns a tab-separated list of added doctypes and doc ids (note no trailing linefeed character).  All document data and field properties are duplicated, so this process is normally followed by job execution on the existing and new documents in order to recalculate the data.
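
As a sketch of handling the return value of the splitdoc$(atpage) form, assuming doc refers to an inbounddoc object with at least three pages, the added doc type and doc id can be separated at the tab character.  The variable names are hypothetical.

rem Split at page 3; pages 3 onward move to a new document
added$=doc'splitdoc$(3)
rem The result is doc type, tab, doc id
tabpos=pos($09$=added$)
if tabpos then newtype$=added$(1,tabpos-1); newid$=added$(tabpos+1)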

validate() performs all validation tests on the document's identification data and fields.  If there is a job assigned to the document, its validation rules are used.  Otherwise, the minimum validation required is performed, requiring library and doc type data.

zoneconfidence(zonename$[,jobname$]) returns the lowest OCR confidence value for words in the named zone, from 1 to 100.  If no jobname$ is specified, the job currently assigned to the document is used.  If no values are available, this returns 100.  Confidence values are returned by an UnForm OCR service.  If zonename$ is null (""), all zones are checked.
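
One possible use, assuming doc refers to an inbounddoc object with a job assigned, is to route documents with low zone confidence to a person for review.  The threshold of 90 and the user id "reviewer" are hypothetical.

rem Check all zones of the assigned job and route low-confidence documents for review
if doc'zoneconfidence("")<90 then doc'assignto("reviewer")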