CmsExtractorPdf (OpenCms Core API, version 20.0)

java.lang.Object

org.opencms.search.extractors.A_CmsTextExtractor

org.opencms.search.extractors.CmsExtractorPdf

All Implemented Interfaces: I_CmsTextExtractor

public final class CmsExtractorPdf extends A_CmsTextExtractor

Extracts the text from a PDF document.

Since: 6.0.0

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| I_CmsExtractionResult | extractText(InputStream in) | Extracts the text and meta information from the document on the input stream. |
| static I_CmsTextExtractor | getExtractor() | Returns an instance of this text extractor. |

Methods inherited from class org.opencms.search.extractors.A_CmsTextExtractor
combineContentItem, extractText, extractText, extractText, extractText, removeControlChars

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Method Details
- getExtractor
  
  public static I_CmsTextExtractor getExtractor()
  
  Returns an instance of this text extractor.
  
  Returns:
  
  an instance of this text extractor
- extractText
  
  public I_CmsExtractionResult extractText(InputStream in) throws Exception
  
  Description copied from interface: I_CmsTextExtractor
  
  Extracts the text and meta information from the document on the input stream.
  The encoding of the input stream is either not required (the document type may have one common default encoding), or the extractor can determine the encoding automatically from the provided input stream.
  Delivers the same result as calling I_CmsTextExtractor.extractText(InputStream, String) with the String argument set to null.
  Specified by:
  
  extractText in interface I_CmsTextExtractor
  
  Overrides:
  
  extractText in class A_CmsTextExtractor
  
  Parameters:
  
  in - the input stream for the document to extract the text from
  
  Returns:
  
  the extracted text and meta information
  
  Throws:
  
  Exception - if the text extraction fails
  
  See Also:
  
  I_CmsTextExtractor.extractText(java.io.InputStream, java.lang.String)
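The contract described above, a static getExtractor() factory plus a single-argument extractText that behaves like the two-argument overload called with null, can be illustrated with a minimal, self-contained sketch. The types TextExtractor and SketchExtractor are hypothetical stand-ins and not the OpenCms classes. The sketch also assumes that the String parameter names an encoding, that UTF-8 is the fallback, and that getExtractor() hands out a shared instance; the real CmsExtractorPdf parses PDF data and returns an I_CmsExtractionResult, which this sketch does not attempt.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for I_CmsTextExtractor, reduced to the two overloads discussed above. */
interface TextExtractor {
    String extractText(InputStream in) throws Exception;
    String extractText(InputStream in, String encoding) throws Exception;
}

/** Hypothetical extractor that mimics the documented contract; it does not parse PDF. */
final class SketchExtractor implements TextExtractor {
    // Assumption: one shared, stateless instance is returned by the factory method.
    private static final TextExtractor INSTANCE = new SketchExtractor();

    private SketchExtractor() {
        // instances are obtained through getExtractor()
    }

    static TextExtractor getExtractor() {
        return INSTANCE;
    }

    @Override
    public String extractText(InputStream in) throws Exception {
        // Same result as the two-argument overload with a null encoding.
        return extractText(in, null);
    }

    @Override
    public String extractText(InputStream in, String encoding) throws Exception {
        // Assumption: fall back to UTF-8 when no encoding is supplied.
        Charset charset = (encoding == null) ? StandardCharsets.UTF_8 : Charset.forName(encoding);
        return new String(in.readAllBytes(), charset).trim();
    }
}

final class ExtractorContractDemo {
    public static void main(String[] args) throws Exception {
        byte[] document = "  Hello PDF  ".getBytes(StandardCharsets.UTF_8);
        TextExtractor extractor = SketchExtractor.getExtractor();

        String viaOneArg = extractor.extractText(new ByteArrayInputStream(document));
        String viaTwoArgs = extractor.extractText(new ByteArrayInputStream(document), null);

        System.out.println(viaOneArg.equals(viaTwoArgs)); // prints "true"
    }
}
```

Since extractText is declared with throws Exception, callers of the real API would likewise need to handle or propagate the exception, and would typically close the input stream themselves, for example with try-with-resources.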