Image and Index PDFs

A How-To Recipe
October 16, 2006
SQI Staff

Problem

Office productivity tools such as word processors, spreadsheets and presentation software generate extensive and valuable material in every organization. PDF is a common distribution format for these productivity documents, and documents not already in PDF format can easily be converted and published as PDFs.

In many cases, the PDF contains frequently accessed information. Viewing these documents directly in the browser, without having to download them, is a much more efficient way to reach that information. To do this, the documents must be moved to the Knowledge Center, their contents displayed directly in the browser, and their text indexed so that it can be searched using KnowledgeDex.

Or, stated in CIE terms: how do I publish a PDF to the Knowledge Center and have the document contents added to the KnowledgeDex index?

Solution

The SQI Collaborative Knowledge Base provides PDF publishing capabilities. The PDF document is linked for download. It is imaged in HTML for quick web browsing and is indexed by KnowledgeDex.

PDF Imaging Steps:

  1. Create New Page
  2. Upload PDF file
  3. Edit the DisplayPDF Plugin

Each step is presented below.

1. Create New Page

The first step is to create a new page that will contain the PDF document. To do this, go to the page that will be the parent of the new page. Edit this page to create the link for the new page. For example:

[:NewPage: Title of New Page for PDF Document]

Save the edit and click on the new link (displayed in red because the page does not yet exist). Select the Create blank page option from the new page dialog box. Then copy the following and paste it into the new page.

= Title =

[[DisplayPDF(src=attachmentName, dpi=72, pages=5, title=TitleGoesHere)]]

Save page.

2. Upload PDF file

Click on Attachments at the bottom of the topic page. Click on Browse and select the PDF to upload, then click on the Upload button.

Attached.png

Figure 1: Highlighted attachment name

3. Edit the DisplayPDF Plugin

Copy the attached file name (highlighted in the above figure) and click on Edit(text). Paste the attachment name into the DisplayPDF plugin right after src=, and put the PDF title right after title= (note that the title is not in quotes). See the example below.

= Chapter 1 =

[[DisplayPDF(src=Chaptter-1-v6.pdf, dpi=72, pages=99, title=Chapter 1)]]

By default, DisplayPDF images 5 pages. In this example we want all the pages imaged, so 99 replaces the 5 right after pages=.

Save the page.
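The plugin line can also be assembled programmatically, which helps when many PDFs are being published. The following is a minimal sketch in Python, not part of the Knowledge Base itself: the function names build_display_pdf_macro and build_page are hypothetical, and the defaults simply mirror the values described in this recipe.

```python
def build_display_pdf_macro(src, dpi=72, pages=5, title="untitled"):
    """Return a DisplayPDF line ready to paste into a wiki page.

    Hypothetical helper: it only formats text and does not call the plugin.
    """
    if "," in title:
        # The recipe states that the title cannot contain commas.
        raise ValueError("title must not contain commas: %r" % title)
    return "[[DisplayPDF(src=%s, dpi=%d, pages=%d, title=%s)]]" % (
        src, dpi, pages, title)


def build_page(heading, macro_line):
    """Return the page text: a heading line followed by the plugin line."""
    return "= %s =\n\n%s\n" % (heading, macro_line)


# Reproduce the Chapter 1 example from step 3.
macro = build_display_pdf_macro("Chaptter-1-v6.pdf", pages=99, title="Chapter 1")
print(build_page("Chapter 1", macro))
```

The printed text matches the Chapter 1 example above and can be pasted into the new page as-is.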

More Discussion

Indexing

On the next scheduled KnowledgeDex indexing cycle, all the words within the PDF will automatically be added to the index.

More Detail on Options

The standard form and default values of the DisplayPDF plugin are shown below.

[[DisplayPDF(src=attachmentName.pdf, dpi=72, quality=png, pages=5, title=pdfTitle)]]

The options (parameters) for the DisplayPDF plugin are:

  • src - The PDF file to display and index. The file must be an attachment on the current page.

  • dpi - Dots per inch: the number of pixels for each inch of the page. Default = 72. At 72 dpi, a page that is 8.5x11 inches produces a full image of 612x792 pixels.

  • quality - Currently, only png is supported. May be omitted. Default = png

  • pages - The maximum number of pages to process. Default = 5. Note: If more pages were processed previously than are now specified (for example, after lowering the number of pages), you must delete the *_cache.xml file; otherwise, the previously processed pages will still be shown.

  • title - The title to present to the user. Note that this cannot contain any commas. Default = untitled
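To illustrate how these options and defaults fit together, here is a rough Python sketch that reads a DisplayPDF line into a dictionary. It is an assumption about how such a line could be interpreted, not the plugin's actual code: the name parse_display_pdf and the approach of splitting on commas are hypothetical. Under that assumption, it also shows why a comma in the title would cause trouble.

```python
import re

# Defaults as listed in this recipe; src has no default and must be supplied.
DEFAULTS = {"src": None, "dpi": "72", "quality": "png",
            "pages": "5", "title": "untitled"}

MACRO_RE = re.compile(r"\[\[DisplayPDF\((.*)\)\]\]")


def parse_display_pdf(line):
    """Parse a DisplayPDF line into a dict of options (illustrative only)."""
    match = MACRO_RE.search(line)
    if match is None:
        raise ValueError("not a DisplayPDF line: %r" % line)
    options = dict(DEFAULTS)
    # Assumption: options are separated by commas, so a comma inside the
    # title would be read as the start of another option.
    for part in match.group(1).split(","):
        key, sep, value = part.partition("=")
        key = key.strip()
        if not sep or key not in DEFAULTS:
            raise ValueError("unrecognized option: %r" % part.strip())
        options[key] = value.strip()
    if options["src"] is None:
        raise ValueError("src is required")
    options["dpi"] = int(options["dpi"])
    options["pages"] = int(options["pages"])
    return options


print(parse_display_pdf(
    "[[DisplayPDF(src=Chaptter-1-v6.pdf, dpi=72, pages=99, title=Chapter 1)]]"))
```

In this sketch, quality falls back to png because the example line omits it, and a title such as "Chapter 1, Intro" is rejected because the text after the comma does not look like an option.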

Two common changes to the default values are:

  • pages=999
    If you want all the pages of an attachment that has more pages than the default, replace the default of 5 with a large number such as 999.

  • dpi=96
    If the PDF contains graphics that do not display clearly at the default resolution, increase it to 96 dots per inch. This makes the image of the PDF larger, which increases the clarity of graphics. It also increases the time required to display the page. For really fine detail, the PDF is available for download.
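The effect of the dpi setting on image size is simple arithmetic: pixels = inches x dpi. The short Python sketch below works this out for an 8.5x11 inch page; the function name page_pixels is hypothetical and the code is not part of the plugin.

```python
def page_pixels(width_in, height_in, dpi):
    """Return (width, height) in pixels for a page imaged at the given dpi."""
    return (round(width_in * dpi), round(height_in * dpi))


# Compare the default resolution with the higher setting discussed above.
for dpi in (72, 96):
    width, height = page_pixels(8.5, 11, dpi)
    print("dpi=%d: %dx%d pixels" % (dpi, width, height))
# dpi=72: 612x792 pixels
# dpi=96: 816x1056 pixels
```

Going from 72 to 96 dpi increases the pixel count per page by a factor of about 1.78, which is consistent with the longer display time noted above.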


Univ/CIE/KA/PdfPlugginForImageIndex (last edited 2015-03-06 18:11:25 by localhost)