Reveal Help Center

Creating Document Text Sets

Text Sets are administered in the Reveal Review Manager under the Project Setup panel, using the Text Sets link. Text sets are searchable text groups defined by import stream, for example extracted text, optical character recognition (OCR), translation or transcription. Each may be indexed separately and have its own parameters, including maximum document size and an edited Common Words list. The sets to be imported and indexed are selected during Document import.

Default text sets in a newly-created Reveal Project are:

  • Native / HTML - Extracted HTML from native files.

  • Extracted - Extracted text from native files, such as Word documents, email messages, PowerPoint slides or Excel spreadsheets.

  • OCR / Loaded - Text loaded from a file or from OCR text documents accompanying images.

  • Transcription - Default text set for audio/video transcriptions.

Additional text sets may be added for Translations, for Manual OCR, or for sets of documents requiring a specially-defined Common Words list.

To create a text set:

  • Open Text Sets in the Project Setup panel within Reveal Review Manager.

  • Click the New button. You will then be presented with a number of items to configure for the new text set:

    • Name is how the text set will be referenced in Reveal in areas such as indexing, searching, and during review.

    • Description is a longer, more descriptive label for the Text Set, kept for documentation.

    • Enabled controls whether the Text Set will be available for use.

    • Load Field is the field used to link text or native paths for indexing.

    • Analyzer is the text analyzer that is used on the extracted text prior to indexing. This should be set to the expected source language for your documents.

You can now control the various indexing size limits at the Text Set level, instead of across an entire case.
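
For planning or documenting a new text set before entering it in Review Manager, the settings above can be sketched as a simple record. This is only an illustration: the class, the field names (including `max_document_size_mb` for a per-set size limit), the sample values and the checks are all assumptions, not part of any Reveal API or Reveal's own validation rules.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class TextSetConfig:
    """Hypothetical record of the fields on the New Text Set form."""
    name: str                      # how the set is referenced in indexing, search and review
    load_field: str                # field that links text or native paths for indexing
    analyzer: str = "English"      # expected source language of the documents
    description: str = ""          # longer label kept for documentation
    enabled: bool = True           # whether the set is available for use
    max_document_size_mb: Optional[int] = None   # assumed name for a per-set size limit
    common_words: Set[str] = field(default_factory=set)


def check_config(config: TextSetConfig) -> List[str]:
    """Return a list of problems; the rules here are illustrative assumptions."""
    problems = []
    if not config.name.strip():
        problems.append("Name is required.")
    if not config.load_field.strip():
        problems.append("Load Field is required.")
    if config.max_document_size_mb is not None and config.max_document_size_mb <= 0:
        problems.append("Maximum document size must be positive.")
    return problems


# Hypothetical example: an additional text set for manually OCR'd documents.
manual_ocr = TextSetConfig(
    name="Manual OCR",
    load_field="MANUAL_OCR_PATH",
    description="Text from documents re-OCR'd by hand after import",
    max_document_size_mb=50,
)
print(check_config(manual_ocr))
```

Keeping one such record per text set makes it easy to see at a glance which sets carry their own size limit or Common Words list.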

The Common Words set for an index can also be customized on the Common Words tab. Common Words are those deemed too pervasive in the language for useful indexing and searching; circumstances of the case may move a Project Manager to modify this list. For example, in a securities matter you may wish to remove the word "put" from the Common Words list so that this kind of action can be searched, if not for the entire case, then for a set of documents supplied by a key Custodian.
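
The effect of removing "put" from a Common Words list can be illustrated with a small sketch. It assumes a simplified tokenizer and a tiny, made-up word list; Reveal's actual analyzer and default Common Words list are not reproduced here.

```python
import re


def index_terms(text, common_words):
    """Lowercase the text, split it into words, and drop any Common Words."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [t for t in tokens if t not in common_words]


# Illustrative list only; not Reveal's default Common Words.
sample_common_words = {"a", "and", "of", "put", "the", "to"}
text = "The trader bought a put option"

# With "put" treated as a Common Word, it never reaches the index.
default_terms = index_terms(text, sample_common_words)

# A text set with an edited list keeps "put" searchable.
edited_common_words = sample_common_words - {"put"}
edited_terms = index_terms(text, edited_common_words)

print(default_terms)  # ['trader', 'bought', 'option']
print(edited_terms)   # ['trader', 'bought', 'put', 'option']
```

Because each text set can carry its own list, the edited list could apply only to the key Custodian's documents while the rest of the case keeps the default.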

Caution

If you update the Common Words list, you must delete the current indexes and then reindex for the changes to take effect.
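
Why a reindex is needed can be seen in a toy inverted index. Words are filtered out when the index is built, so an index built with the old list has no entry for a word that was later removed from it. The function, documents and word list below are hypothetical and do not reflect how Reveal's indexing engine is implemented.

```python
import re
from collections import defaultdict


def build_index(docs, common_words):
    """Map each non-common word to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in re.findall(r"[a-z0-9']+", text.lower()):
            if token not in common_words:
                index[token].add(doc_id)
    return dict(index)


docs = {
    1: "Client exercised the put before expiry",
    2: "Lunch at noon",
}
old_common_words = {"at", "put", "the"}   # illustrative list only
index = build_index(docs, old_common_words)

# Editing the list alone does not change what is already indexed.
new_common_words = old_common_words - {"put"}
stale_hits = index.get("put", set())

# Discarding the old index and rebuilding picks up the edited list.
index = build_index(docs, new_common_words)
fresh_hits = index.get("put", set())

print(stale_hits)  # set()
print(fresh_hits)  # {1}
```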