I still recall my undergrad days in Amsterdam, Netherlands, in the early 1990s, when large-scale document digitization projects were common in Europe. Many large enterprises were in full swing digitizing as much paper as possible to become ready for the 2020 digital initiatives, which aimed to let organizations work in a completely digital environment: eliminate file cabinets, reduce the risk of losses (both documents and financial) or misfiling, and consolidate processes.
As cloud applications and mobile solutions became pervasive, the business world became ripe for digital disruption and digital transformation, and paper filing consequently began to decline at big businesses.
This resulted in many initiatives to mass-scan and categorize documents, files, vendor invoices, registration documents, human resources files, meeting and board minutes, land and title documents, engineering drawings, and many other document types.
It is now 2018, and I’ve been working as a Document Automation Consultant with Connectis Group for the last seven years. I have significant experience and knowledge to share on how to plan a digitization project (turning paper into machine-ready data), how to estimate its costs accurately, and how to ensure that the delivered data and documents meet client quality requirements.
The initial plan must define which documents, files, and folders need to be scanned (and the quantity of files, pages, and images), how they will be consumed further along the document process, what file formats are required, what information needs to be integrated with other internal corporate applications, how much storage the documents and their indexes will require, and the costs associated with scanning. The plan should also cover thorough quality checks and the final release of the digitized documents.
Factors to take into account in calculating cost include:
- One- or two-sided documents
- Color or black and white
- Office documents (print), photographic data or a combination
- Number of index fields per page, per document (multiple pages), per folder
- Resolution of scanning (e.g., 300 dpi as office standard or 600 dpi for higher resolution)
- Whether optical character recognition (a searchable text version) of the document will be required or only a graphic image (e.g., a fax-like image) will be used.
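To make the effect of these factors concrete, here is a minimal sketch of how they could feed into an estimate. It is only an illustration, not our actual pricing model: the `ScanJob` class, the `estimate_cost` function, the Letter-size page dimensions, and all rates are assumptions made up for this example. The storage figure is the uncompressed size (pixels times bits per pixel); compressed file formats will usually be much smaller.

```python
from dataclasses import dataclass

# Uncompressed bits per pixel for common scan modes.
BITS_PER_PIXEL = {"bw": 1, "gray": 8, "color": 24}


@dataclass
class ScanJob:
    """Hypothetical description of a scanning job (illustration only)."""
    sheets: int              # physical sheets of paper
    two_sided: bool          # scan both sides of each sheet?
    mode: str                # "bw", "gray", or "color"
    dpi: int                 # e.g., 300 or 600
    index_fields: int = 0    # total index fields to capture
    ocr: bool = False        # searchable text required?
    width_in: float = 8.5    # assumed page width in inches
    height_in: float = 11.0  # assumed page height in inches

    def images(self) -> int:
        """One image per scanned side."""
        return self.sheets * (2 if self.two_sided else 1)

    def raw_bytes_per_image(self) -> float:
        """Uncompressed size: pixels times bits per pixel, in bytes."""
        pixels = (self.width_in * self.dpi) * (self.height_in * self.dpi)
        return pixels * BITS_PER_PIXEL[self.mode] / 8


def estimate_cost(job: ScanJob, per_image: float, per_field: float,
                  per_ocr_image: float = 0.0) -> float:
    """Sum scanning, indexing, and optional OCR charges.

    All rates are caller-supplied placeholders, not real prices.
    """
    cost = job.images() * per_image + job.index_fields * per_field
    if job.ocr:
        cost += job.images() * per_ocr_image
    return cost


job = ScanJob(sheets=1000, two_sided=True, mode="bw", dpi=300, index_fields=500)
print(job.images())               # 2000 images for 1000 two-sided sheets
print(job.raw_bytes_per_image())  # 1051875.0 bytes before compression
print(estimate_cost(job, per_image=0.10, per_field=0.02))
```

Doubling the resolution from 300 to 600 dpi quadruples the pixel count per image, and switching from black and white to 24-bit color multiplies the raw size by 24, which is why these choices matter for both storage and cost.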
Your documents should be prepared for efficient high-speed scanning. This manual activity also adds to your digitization cost. Here are some things to consider during the document prep phase:
- Envelope bursting (if any)
- Removal of paper clips and staples and a way to keep multiple-page documents together (e.g., use of separator or index sheets in front of each new document)
- Mounting of odd-sized notes (e.g., sticky notes) on standard-sized pages
- Removal of documents from paper folders
- Capture of information recorded on the folder tab or inside of the folder
- Shading of seals (impressions on paper) with a pencil, which is often done so that the seal shows up on scanning
At Connectis Group, we use a variety of scanning hardware and software solutions to generate a digital image for each sheet of paper and to capture document index data (metadata) in a format that can be used to trigger downstream business processes. Once captured, the document can be named automatically from document field data (for instance, the vendor name or invoice number) or according to other standard corporate naming conventions. The images and document outputs can then be routed to document repository systems, including ECM, EIM, and DMS platforms, with further predefined naming conventions and indexing.
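As a rough illustration of naming a file from captured field data, consider the sketch below. It is not the naming logic of any particular capture product: the `build_file_name` function, the field names `vendor_name` and `invoice_number`, and the default pattern are all hypothetical and would be replaced by your own naming convention.

```python
import re


def build_file_name(fields: dict,
                    pattern: str = "{vendor_name}_{invoice_number}",
                    ext: str = "pdf") -> str:
    """Build a file name from captured index fields (hypothetical convention)."""

    def clean(value) -> str:
        # Replace anything that is not a letter or digit with a hyphen,
        # so the result is safe to use as a file name.
        text = re.sub(r"[^A-Za-z0-9]+", "-", str(value).strip())
        return text.strip("-")

    safe = {key: clean(value) for key, value in fields.items()}
    # A KeyError here means the pattern needs a field that was not captured.
    return f"{pattern.format(**safe)}.{ext}"


captured = {"vendor_name": "Acme Corp.", "invoice_number": "INV/2018-0042"}
print(build_file_name(captured))  # Acme-Corp_INV-2018-0042.pdf
```

Letting a missing field raise an error, rather than silently producing a partial name, makes it easier to send that document to manual review before it reaches the repository.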
Indexing is a crucial part of the process, and the amount of indexing affects the cost of digitization. Some index information may be captured electronically (e.g., through optical character recognition), and some may be supplied by other systems, for example through database lookups. If an employee name and number already exist in a database, capturing only the employee number allows the name and address to be pulled from that system and used as indexes. These indexes are often referred to as metadata and should be embedded in the actual document. Indexing can be done at the folder, document, or page level. It can also target a third-party application: we can automatically send a data stream of metadata to (almost) any application or database and have the folders and files created and named according to your company naming conventions.
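The employee-number lookup described above could look something like the following sketch. It uses an in-memory SQLite database purely as a stand-in for whatever system holds your employee records; the `employees` table, its columns, the sample row, and the `enrich_index` function are assumptions made for this example.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create a throwaway database standing in for an existing HR system."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employees ("
        "employee_number TEXT PRIMARY KEY, name TEXT, address TEXT)"
    )
    conn.execute(
        "INSERT INTO employees VALUES (?, ?, ?)",
        ("E1001", "Jane Doe", "1 Example Street"),  # made-up sample record
    )
    conn.commit()
    return conn


def enrich_index(conn: sqlite3.Connection, employee_number: str) -> dict:
    """Look up name and address so only the number has to be captured."""
    row = conn.execute(
        "SELECT name, address FROM employees WHERE employee_number = ?",
        (employee_number,),
    ).fetchone()
    if row is None:
        # Unknown number: flag the document for manual review.
        return {"employee_number": employee_number, "needs_review": True}
    name, address = row
    return {
        "employee_number": employee_number,
        "name": name,
        "address": address,
        "needs_review": False,
    }


conn = make_demo_db()
print(enrich_index(conn, "E1001"))
print(enrich_index(conn, "E9999"))
```

Capturing one field and looking up the rest reduces keying effort, which is one of the ways the amount of indexing can be kept from driving up cost.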
We also provide secure cloud access (using access control and user access management tools) to all the digital files through a secure HTTPS connection to our Document Management System, and we can maintain the life cycle of your digital information based on defined records management and retention policies.
After the digitized files have been uploaded and tested, a decision needs to be made about the handling of the originals. Most of our scanning projects do not require re-clipping the original documents and folder contents. Instead, the originals are left as is and given a disposal date (shredding), so that the digitized version serves as the official, legally binding record.
If you’re considering a new digitization project, we can put together a custom proof of concept to sort out all of the details before you enter into a large scanning contract. Together, we’ll select physical document samples and complete the work needed to prepare and process the files or documents for scanning. We’ll then develop the project tracking for each file folder and document ID number, counting the number of pages, the number of images (for two-sided pages), and the index terms for the folder and document. When the digital images are generated, we’ll review them and verify that the numbers of documents, pages, and images match those recorded on the original log sheet and that the sequence of pages is identical to the original. We don’t proceed with the full implementation until the accuracy meets your requirement. This phase is critical in making sure the final digitized images are usable, the metadata is captured accurately, the indexing meets your criteria, and the digitized files have the integrity of the original file.
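The count and sequence checks in the proof of concept can be sketched in a few lines. This is only an illustration of the idea, not our production tooling: the `Counts` class, the `compare_counts` and `sequence_matches` functions, and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Totals for one folder, as logged at prep time or produced by scanning."""
    documents: int
    pages: int
    images: int


def compare_counts(logged: Counts, scanned: Counts) -> list:
    """Return a list of human-readable mismatches (empty means all match)."""
    problems = []
    for field in ("documents", "pages", "images"):
        expected = getattr(logged, field)
        actual = getattr(scanned, field)
        if expected != actual:
            problems.append(f"{field}: logged {expected}, scanned {actual}")
    return problems


def sequence_matches(logged_page_ids, scanned_page_ids) -> bool:
    """True only if the scanned pages appear in the original order."""
    return list(logged_page_ids) == list(scanned_page_ids)


log_sheet = Counts(documents=10, pages=42, images=60)
scan_result = Counts(documents=10, pages=41, images=60)
print(compare_counts(log_sheet, scan_result))  # ['pages: logged 42, scanned 41']
print(sequence_matches([1, 2, 3], [1, 3, 2]))  # False
```

A folder would only be released once `compare_counts` returns an empty list and the page sequence matches the original.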
So if your journey towards digital transformation has led you to investigate how to improve your document processes, call us at 905-695-2200 or email us at firstname.lastname@example.org