Reveal Help Center

Data sources

The Data sources section allows you to upload, connect, and process documents & tags to your storybook.

60638bf3d34ae.png
  1. Import SQL data to your storybook

60638bf5d322d.png

Click the SQL server button to add a SQL connection.

Name your SQL server data source, select the server, database, table, and map fields.

60638bf7bc2cf.png
  • Naming

    • Name: Give your data source a name that you can use to reference in Reveal AI. We recommend naming it with descriptive information like server, database and table name so you can easily identify it.

    • SQL mapping: you can map all of your database fields from scratch by selecting “Custom” or save time by loading mapping selections used previously.

  • SQL connection

Select a source server you wish to read from. If your desired server is not available in the drop-down, go to Server connections under Tenant Settings to connect it.

Choose the database and table from which you wish to read your source data.

  • Database fields

Before processing begins, a SQL table with required metadata fields must be created. Only fields with compatible field types will be displayed as mapping options. Asterisked fields are required for processing.

ID

Field Name

Field Type

Field description

1

Id*

Number

Numeric identifier of the document. For Relativity, use ArtifactID

2

Control Number*

Text

Text identifier of the document

3

Custodian*

Text

Name of the custodian. If unavailable, processing will default empty values to “Empty_Custodian”

4

MD5 Hash*

Text

MD5Hash value, default to Control Number if empty

5

Group Identifier*

Text

Used to group email and attachments together; default to Control Number if empty. Notice for parent, the Group Identifier must be equal to the Control Number

6

Process Status*

Number

Indicate Reveal AI processing status

7

Extracted Text*

Text

Path to the text file exported, or text content itself

8

Email From

Text

Field only relevant to emails

9

Email To

Text

Field only relevant to emails

10

Email CC

Text

Field only relevant to emails

11

Email BCC

Text

Field only relevant to emails

12

Email Subject

Text

Field only relevant to emails

13

Primary Date Time

DateTime

Date and Time the email was sent, or last modified date time for attachments & e-files. It is important to include the “Time” part for this field.

14

Email Time zone

Text

The time zone with which the document should be processed. Most common values: EST/CST/PST/GMT

15

Email Attachment Count

Number

Field only relevant to emails. Number of attachments the email has.

16

eFile Author

Text

Last Author of the efile or attachment, if it exists

17

eFile Filename

Text

Filename of the efile or attachments

The Processing data settings page shows you the steps the system will take once processing begins. There are several settings associated with the computer-generated content detection, ingestion, natural language pipeline and pattern detection steps that you may modify. You may also modify the processing XML to include any custom settings that may be desired.

An Email notification option has been added, where if checked all data exports will automatically trigger a notification.

Note

If you are processing a Relativity workspace, the Reveal AI ADS package will automatically create a “NexLPProcessingStatus” field in the workspace. Before processing data, update Process Status field to 0 for all documents to be processed. For a list of all possible values for this field, refer to Appendix A. Make sure that the processing status field value for all documents that do not need to be processed has a Null value or a value greater than 5. If there are any records with processing status of 4, reset their processing status to 0. Process Data with any combination of entity models including the multiple entity Standard Model or any available from the Model library.

60638bf9eb912.png
  • Computer-generated Content Detection: click the Show options button to view and set options for computer-generated content detection:

    604232d12b574.png

Under Ingestion:

  • Default Time zone: The “Default time zone” option tells Reveal AI which time zone to use when processing email text if the Time Zone information is missing in the header lines.

  • Date Format: The system will prioritize processing using either UK or US date formats based on user selection for this option.

Under Natural language pipeline:

  • NER for eFiles: By default, Reveal AI only extracts Entities for emails. The “NER for eFiles” option allows user to extract Entities from loose files and attachments. Notice this could increase time needed for processing.

  • Entity models to run on data: Any models may appear by default according to your configuration. You may turn any model off by unchecking the associated checkbox. By pressing the Model library button you can view the list of additional available models as shown below. You may run any combination of models or none at all.

    60638bfd4d4a5.png

Under AI vector generation:

  • AI vector generation: By default you will build COSMIC vectors. You may optionally decide to build NexBERT vectors which is required if you are going to use NexBERT models.

    604232d4bfe93.png

Under Near duplicate document detection:

  • Check the Near duplicate detection box to run the process Near duplicate documents will be grouped and similarity scores calculated.

Under Pattern detection:

  • Patterns: By default, Reveal AI processes Patterns. If Pattern Detection is not required, uncheck the “Patterns” checkbox.

Before kicking off processing, the system will confirm the number of documents you will be processing and undergo final pre-flight checks to ensure the data is in good shape to process.

60638c0099586.png

If your data isn’t set up correctly you may get a warning in this screen.

Note

Processing will kick off regardless of any warnings.

604232d7dd042.png There are no documents ready for processing 604232d90bddf.png

This is the number of documents with a process status=0. If this is not the number you were expecting, please check your data source and update the process status field.

604232d7dd042.png 0 of X processing services are available 604232d90bddf.png

Your processing sevice is stopped. If you click Start your processing job will be added to the queue, but the processing service will need to be launched before this one begins.

604232d7dd042.png No service found with POS Tagger API configured 604232d90bddf.png

Your POS Tagger sevice is stopped. If you click Start, your processing job will be added to the queue, but the POS Tagger service will need to be launched before this one begins.

604232d7dd042.png No service found with NER configured 604232d90bddf.png

This data source was configured to run with entity extraction but no service was found with Named-Entity Recognition (NER) configured. It is recommended that this be configured to ensure actual results.

604232d7dd042.png No service found with DCI configured 604232d90bddf.png

This data source was configured to run with dataless classification but no service was found with DCI configured. It is recommended that this be configured to ensure actual results.

604232d7dd042.png Documents may not have attachments 604232d90bddf.png

Column “Groupidentifier” in Table “EDDSDBO.Document” of Database “EDDS1015555_Test” contains at least one blank or null value. Column “Groupidentifier” values will be replaced with values from Column “ControlNumber” for rows where this occurs. When this occurs, documents will not have any associated attachments.

604232d7dd042.png Documents may not have valid md5 value 604232d90bddf.png

Column “MD5Hash” in Table “EDDSDBO.Document” of Database “EDDS1015555_Test” contains at least one blank or null value. Column “MD5Hash” values will be replaced with values from Column “ControlNumber” for rows where this occurs.

604232d7dd042.png Documents may not have some necessary indexes 604232d90bddf.png

Table “EDDSDBO.Document” of Database “EDDS1015555_Test” does not have an index which contains the column Column””.

Create Missing Index

Once you click the Start button processing will kick off immediately. You will be able to check the status of processing in the Notifications center and/or storybook dashboard. There will be a square STOP button to the right of the Notification entry until processing is completed.

  • Notifications center:

60638c0c513d4.png
  • Storybook dashboard view

If you decide to click the Stop button while data is processing, your progress will be saved. You can resume by going to the Data sources page and clicking the triangle icon under the “Process” column as shown below.

60638c0e5db92.png
  1. Import from Exchange/Office 365

To import from Exchange or Office 365 choose the Exchange/Office 365 button:

60638c1035c49.png

And this window appears:

60638c1211c70.png

The fields for adding an Exchange or an Office 365 data source are:

  • Data source name (required): Choose an appropriate name.

  • Storybook (required): Automatically populated.

  • Control Number Prefix (required): Some descriptive title which will appear in front of the control number.

  • Server Connection: Choose from the dropdown list of connected data sources. The connected data sources choices can be configured as described in 2. Tenant Settings > F. Server Connections > 4. Add Exchange/Office 365 Server Connection.

  • Schedule: Either “On demand” which will initiate import when you complete configuration, or “Daily” according to whatever hour you specify.

  • Email notification: If checked all data exports will automatically trigger a notification.

When “Server Connection” is chosen, a checkbox list of mailboxes appears:

60638c142d535.png

Check any or all mailboxes from the list under Data and then choose Save and Continue.

60638c163d1c2.png

The following page appears:

60638c181892f.png
60638c1a257e0.png

The Processing data settings page shows you the steps the system will take once processing begins. There are several settings associated with the computer-generated content detection, ingestion, natural language pipeline, and pattern detection steps that you may modify. You may also modify the processing XML to include any custom settings that may be desired.

  • Computer-generated Content Detection: Click the Show options button to view and set options for computer-generated content detection:

    60638c1c210a3.png
  • Under Ingestion:

    • Default Time zone: The “Default time zone” option tells Reveal AI which time zone to use when processing email text if the Time Zone information is missing in the header lines.

    • Date Format: The system will prioritize processing using either UK or US date formats based on user selection for this option.

  • Under Natural language pipeline:

    • NER for eFiles: By default, Reveal AI only extracts Entities for emails. The “NER for eFiles” option allows user to extract Entities from loose files and attachments. Notice this could increase time needed for processing.

    • Entity models to run on data: Any models may appear by default according to your configuration. You may turn any model off by unchecking the associated checkbox. By pressing the Model library button you can view the list of additional available models as shown below. You may run any combination of models or none at all.

      60638c1e8a707.png
  • Under AI vector generation:

    • AI vector generation: By default you will build COSMIC vectors. You may optionally decide to build NexBERT vectors which is required if you are going to use NexBERT models.

      604232d4bfe93.png
  • Under Pattern detection:

    • Patterns: By default, Reveal AI processes Patterns. If Pattern Detection is not required, uncheck the “Patterns” checkbox.

Before kicking off processing the system will confirm the number of documents you will be processing, and undergo final pre-flight checks to ensure the data is in good shape to process.

60638c21cb122.png

If your data isn’t set up correctly you may get a warning in this screen.

Note

Processing will kick off regardless of any warnings.

604232d7dd042.png There are no documents ready for processing 604232d90bddf.png

This is the number of documents with a process status=0. If this is not the number you were expecting, please check your data source and update the process status field.

604232d7dd042.png 0 of X processing services are available 604232d90bddf.png

Your processing sevice is stopped. If you click Start your processing job will be added to the queue, but the processing service will need to be launched before this one begins.

604232d7dd042.png No service found with POS Tagger API configured 604232d90bddf.png

Your POS Tagger sevice is stopped. If you click Start, your processing job will be added to the queue, but the POS Tagger service will need to be launched before this one begins.

604232d7dd042.png No service found with NER configured 604232d90bddf.png

This data source was configured to run with entity extraction but no service was found with Named-Entity Recognition (NER) configured. It is recommended that this be configured to ensure actual results.

604232d7dd042.png No service found with DCI configured 604232d90bddf.png

This data source was configured to run with dataless classification but no service was found with DCI configured. It is recommended that this be configured to ensure actual results.

604232d7dd042.png Documents may not have attachments 604232d90bddf.png

Column “Groupidentifier” in Table “EDDSDBO.Document” of Database “EDDS1015555_Test” contains at least one blank or null value. Column “Groupidentifier” values will be replaced with values from Column “ControlNumber” for rows where this occurs. When this occurs, documents will not have any associated attachments.

604232d7dd042.png Documents may not have valid md5 value 604232d90bddf.png

Column “MD5Hash” in Table “EDDSDBO.Document” of Database “EDDS1015555_Test” contains at least one blank or null value. Column “MD5Hash” values will be replaced with values from Column “ControlNumber” for rows where this occurs.

604232d7dd042.png Documents may not have some necessary indexes 604232d90bddf.png

Table “EDDSDBO.Document” of Database “EDDS1015555_Test” does not have an index which contains the column Column””.

Create Missing Index

Once you click the start button, processing will kick off immediately. You will be able to check the status of processing in the notification center and/or storybook dashboard.

  • Notifications center:

60638c2c76aee.png
  • Storybook dashboard view

If you decide to click the Stop button while data is processing, your progress will be saved. You can resume by going to the Data sources page and clicking the triangle icon under the “Process” column as shown below.

60638c2ea74e0.png
  1. Import documents from Relativity

Please see Section 4. Integration with 3rd Parties > A. Relativity > 2. Import Documents from Relativity in this Admin Guide.

  1. Import tags from delimited files

Choose the Delimited file option to import user tags or COSMIC tags to your existing documents.

60638c30bb946.png

Enter a data source name and upload your file. Note that the file must be comma-delimited.

The file you upload must contain:

  • One column with control number.

  • Other columns with COSMIC or external tags.

    • COSMIC tags must be labeled Yes, No or Skip. This is the indicator at the document level.

    • External tags may have any value.

60638c32e17a1.png

Batch options: Batch options are by default 10,000 documents with a timeout of 60 seconds, and this should work for most exports. However, you can change the settings if you are having issues with delays during export.

If your file contains headers in the first row, check “First row contains headers.”

Once your file is successfully uploaded you will see the following:

60638c34f32fa.png

Map each field you wish to import using the field mapping tool.

60638c36da429.png

Select a Reveal AI field to which you wish to map your choices. If you would like to create a new field, click Create new field.

The following appears:

60638c38b7596.png

Choose a conflict resolution to tell the system what you want it to do if there are existing values that conflict with the newly uploaded values.

60638c3ab4487.png

Skip: Don’t import the value.

Merge: Keep both the existing and imported values.

Overwrite: Replace the existing value with the imported value.

Once you begin processing:

  • A “Process data” task will begin running in the Notifications Center.

    60638c3c9980b.png
  • You can track the progress in the Storybook/ Admin dashboard.

    60638c3e674ea.png
  • Your delimited file data source will appear in the Data Sources list with type “Delimited file”.

    60638c4013d7a.png

After processing has completed, you should expect the following:

EXTERNAL TAGS: Your tags will now be available in search:

6042331bd3f5a.png

Note

New external tags will not be available on the document level as a user tag.

60638c4299bf0.png

COSMIC TAGS: Correct tag (Yes / No / Skip) will be selected in the thread viewer.

Note

You cannot delete tags after they have been imported.

  1. Import tags from Relativity

Please see 4. Integration with 3rd Parties > A. Relativity > 3. Import Tags from Relativity in this Admin Guide.

  1. Relativity COSMIC Monitoring

Please see 4. Integration with 3rd Parties > A. Relativity > 6. COSMIC Monitoring in this Admin Guide.

Important

Under the new Relativity API, COSMIC Sync will not be available to users connected to RelativityOne. For these users COSMIC monitoring is recommended for tag synchronization.

  1. View data sources

After creation of a data source, new data sources will be available in the Connected data sources section.

Note

Data sources that appear in this list can be in all stages of processing.

  1. Clicking the 6042331edca40.png button under the “Process” column will:

    • Kick off processing if processing has not yet occurred.

    • Resume processing if it had been previously started and stopped.

    • Re-process if processing has already completed.

If you re-process, documents in your storybook will not be affected. Rather, any additional documents added to the data source will be processed.

  1. Clicking the 6042331ff2d9d.png button allows user to edit properties of the data source. User can re-map field, re-name data source by using the edit function. Edits will be picked up upon the next processing.

  1. Clicking the 60423321130af.png button allows user to delete the data source.

    Note

    Deleting the data source will not delete documents from the storybook. In order to delete documents permanently, follow the instructions under Section 1.H Document deletion.

If attempting to re-process, edit or delete a data source while processing is in progress, you will see a warning message:

6042332278d85.png

Click on a data source...

60638c472339d.png

...to see all completed, errored, stopped, or running processes.

60638c492419a.png

Expand details on each task to view & download logs.

  • Summary: provides processing results in detail.

  • Processing: includes processing exceptions.

  • Performance: includes performance metrics.

60638c4b07deb.png

The Error Report highighted above in red is described in Appendix E. New “Error Report” Function in Front-End/NPA.

If your memory usage exceeds 99% you will see a Low memory warning. It is recommended to stop processing, reduce the number of threads in the service configuration and restart processing.

60638c4cecb06.png