java.lang.Object
  org.pdfbox.searchengine.lucene.LucenePDFDocument
This class is used to create a document for the Lucene search engine. It should plug easily into the IndexHTML or IndexFiles demo classes that come with the Lucene project. This class will populate the following fields.
| Lucene Field Name | Description |
| path | File system path if loaded from a file |
| url | URL to PDF document |
| contents | Entire contents of PDF document, indexed but not stored |
| summary | First 500 characters of content |
| modified | The modified date/time according to the url or path |
| uid | A unique identifier for the Lucene document. |
| CreationDate | From PDF meta-data if available |
| Creator | From PDF meta-data if available |
| Keywords | From PDF meta-data if available |
| ModificationDate | From PDF meta-data if available |
| Producer | From PDF meta-data if available |
| Subject | From PDF meta-data if available |
| Trapped | From PDF meta-data if available |
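The summary field in the table above is described as the first 500 characters of the content. The following is a minimal sketch of that truncation in plain Java. It is not taken from the LucenePDFDocument source: the class name SummarySketch, the method summarize, and the constant SUMMARY_LENGTH are hypothetical names used only for illustration, and the handling of null input is an assumption.

```java
public class SummarySketch {
    // Assumed limit, taken from the "First 500 characters" description of the summary field.
    static final int SUMMARY_LENGTH = 500;

    // Returns at most SUMMARY_LENGTH characters from the start of the extracted text.
    // Returning an empty string for null input is an assumption, not documented behaviour.
    static String summarize(String contents) {
        if (contents == null) {
            return "";
        }
        int end = Math.min(contents.length(), SUMMARY_LENGTH);
        return contents.substring(0, end);
    }

    public static void main(String[] args) {
        // Build a 600-character string to show the truncation.
        StringBuilder longText = new StringBuilder();
        for (int i = 0; i < 600; i++) {
            longText.append('x');
        }
        System.out.println(summarize(longText.toString()).length()); // prints 500
        System.out.println(summarize("short text"));                 // prints short text
    }
}
```

Text shorter than the limit is returned unchanged, so short documents have a summary identical to their contents.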
| Method Summary | |
| static Document | getDocument(File file): Gets a Lucene document from a PDF file. |
| static Document | getDocument(InputStream is): Gets a Lucene document from a PDF read from an input stream. |
| static Document | getDocument(URL url): Gets a Lucene document from a PDF located at a URL. |
| static void | main(String[] args): Tests creating a document. |
| Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Method Detail |
public static Document getDocument(InputStream is) throws IOException
Parameters: is - The stream to read the PDF from.
Throws: IOException - If there is an error parsing or indexing the document.
public static Document getDocument(File file) throws IOException
Parameters: file - The file to get the document for.
Throws: IOException - If there is an error parsing or indexing the document.
public static Document getDocument(URL url) throws IOException
Parameters: url - The URL to get the document for.
Throws: IOException - If there is an error parsing or indexing the document.
public static void main(String[] args) throws IOException
Parameters: args - Command line arguments.
Throws: IOException - If there is an error.