Reveal Help Center

Reveal-Brainspace Connector Guide

Summary

The integration between Reveal and Brainspace makes use of several components. The main component is a Java plug-in written by Reveal using an SDK provided by Brainspace. The plug-in drives some of the Brainspace UI functions and communicates to the Reveal REST API to perform database functions. Reveal also communicates directly to the Brainspace REST API to add functionality not present in the plug-in SDK.

This document is based on Brainspace version 6.0.7-8. The preferred browsers are Chrome, Edge and Firefox 52+.

Adding a Connector

The first step in getting started is setting up a connector in Brainspace. This connects the Brainspace UI to the Reveal REST API.

Go to Administration->Connectors and hit the white +Connector button on the right-hand side. Select Reveal for the connector type.

image1.jpg
Connector Settings

Use the scrollbar on the right-hand side of the configuration popup to navigate the window.

Main Settings

The connector name and the URL to connect to the Reveal REST API must be specified.

The naming convention for the Reveal REST API URL is:

https://<hostname>.<domain>/rest/api
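For example, with a hypothetical hostname of revealreview and a domain of example.com, the URL would be:

https://revealreview.example.com/rest/api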

image2.png
Overlay Settings
image3.jpg

The plug-in can automatically create fields in the Reveal case database (in the IMPORT_DOCUMENTS table) and populate the values at build time immediately after ingestion. You can also run the overlay as a separate step after build if desired.

The following is the complete list of fields:

BD EMT AttachmentCount

BD EMT DuplicateID

BD EMT EmailAction

BD EMT FamilyID

BD EMT Intelligent Sort

BD EMT IsDuplicate

BD EMT IsMessage

BD EMT IsUnique

BD EMT MessageCt

BD EMT ThreadHasMissingMessage

BD EMT ThreadID

BD EMT ThreadIndent

BD EMT ThreadPath Full

BD EMT ThreadSort

BD EMT UniqueReason

BD EMT WasUnique

BD EMT WasUniqueReason

BD ExactDupSetID

BD IsExactPivot

BD IsNearDupPivot

BD Languages

BD NearDupSetID

BD NearDupSimilarityScore

BD Primary Language

BD RelatedSetID

BD Summary

BDID

Here is an example of custom fields that were automatically created in Reveal.

image4.png
Predictive Coding Settings

When you create a new predictive coding classification in the Brainspace UI, the plug-in automatically creates the following items in the Reveal case database. You can set the names to use when creating in Reveal. Once predictive coding has begun, the names must not change.

image6.jpg
Advanced Connector Settings

There are some fields located under the “Advanced” link. Click the link to display them.

image5.jpg

The document ingest batch size is how many documents (field values and text) to pull at one time in a chunk over the Reveal API. A good default range is 200-500.

The document id batch size is used during ingest and syncing to return a list of document ids from Reveal. A good default is 10,000.

The overlay batch size is used when sending Brainspace field values to Reveal for overlay. A good default is 1000.
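To get a feel for what these batch settings mean in practice, here is a minimal sketch in Python (with a hypothetical case size; the batch values are the defaults suggested above) of roughly how many Reveal API round trips each phase would make:

import math

# Hypothetical case size and the default batch settings suggested above.
total_documents = 100_000
ingest_batch_size = 500      # documents (field values and text) per ingest pull
id_batch_size = 10_000       # document ids returned per id-listing call
overlay_batch_size = 1_000   # documents per overlay push back to Reveal

# Approximate number of Reveal REST API calls per phase.
id_calls = math.ceil(total_documents / id_batch_size)            # 10
ingest_calls = math.ceil(total_documents / ingest_batch_size)    # 200
overlay_calls = math.ceil(total_documents / overlay_batch_size)  # 100

print(f"document id calls: {id_calls}")
print(f"ingest pulls:      {ingest_calls}")
print(f"overlay pushes:    {overlay_calls}")

Larger batch sizes mean fewer round trips but larger payloads per call, which is presumably why the ingest batch (which carries field values and text) defaults much lower than the document id batch.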

The key field can be either ITEMID or BEGDOC; it is used to sync documents between the systems. ITEMID is the more efficient choice.

Allow deletes on sync push will remove any docs from a work folder in Reveal that are no longer present in a notebook in Brainspace. This is also used in tag sync to un-tag documents that have been removed from a notebook. Note: This pertains to notebook syncing and not connected tag syncing.

You can click Test Connector to verify the connection to the Reveal REST API, or just click Create Connector (this will also verify before it creates).

image7.png

The new connector won’t show any datasets yet because it hasn’t been associated with any. The same connector will be used for multiple cases to pull documents into datasets.

Setting Brainspace root folder in Reveal

Before creating a dataset in Brainspace, you should set the work folder root in Reveal Review for the case you wish to ingest. The setting is located under Admin->Settings->Brainspace. This will limit the number of work folders used within Brainspace in cases where there are a large number of top level work folders directly under the main root.

image9.jpg
image8.jpg

Creating a Dataset for a Reveal Case

Once you have created a connector, the next step is to create a dataset. You would typically have one dataset for each Reveal case. From the connectors page, click the dataset button on the top left and then hit the white +Dataset button on the right-hand side. Name your dataset after a Reveal Review case and click Create.

The groups are not controlled by Reveal; they are Brainspace groups that have access to the dataset. The empty dataset will be created and will expand automatically.

image10.jpg

Click Choose Connector under the middle data section and select a previously created connector.

image11.png

The first time you use a connector it will attempt to connect to the Reveal REST API to verify access. You must enter the username and password used to access the Reveal database.

The next step displays the available Reveal Review cases. Select one and click Save & Proceed.

image12.png

The next step allows you to pick the documents you wish to pull from the case in Review.

image13.png

You can pull from import batches and/or top-level work folders. The work folder root is configured under Admin->Settings->Brainspace in Reveal Review. The work folders available in the plug-in are controlled by the normal security methods in Reveal Review based on the user id used in the connector credentials.

The top two lists are multi-select. Select any combination of items from one or both lists. You must select at least 300 documents to ingest.

Select only one field profile in the bottom list. The field profile should have been previously set up in Reveal Review and should include all fields required for analytics/email threading.

The next screen informs you of the count of documents in the selected items that will be ingested and the possible licensing impact. If you happen to select documents that have already been ingested, they will be re-ingested but they won’t count against the licensing. After confirming, the next step is the field mapping.

image14.jpg
Dataset field mapping

You can include whatever fields you like in the Reveal profile, including non-importable fields. There are a few fields that have special meaning in Brainspace and are recommended (these are shown below). The plug-in attempts to map all fields from the profile according to their data type or special meaning, so you shouldn’t have to change the mapping you are presented with; just click Continue.

image15.jpg
Preview Field Mapping

If you would like to preview the values for the mapped fields before hitting Continue, you can click the little icon in the middle right. You will be presented with a popup that contains circle buttons at the bottom to navigate through five docs. Body text is not included in the popup; it is too long and not very useful in this case.

image17.jpg
Special Field Mapping

The following are fields of special significance:

Reveal Field Name

Brainspace

SUBJECT_OTHER

Maps to “Subject”. Used as the title in Brainspace, so if the value is blank in Reveal Review, the Brainspace title will be "No title". It is possible to map more than one Reveal field to the same Brainspace field, and the values should combine; however, this could interfere with Brainspace analytics.

SENDER

Maps to “From”.

RECIPIENT

Maps to “To”.

CC_ADDRESSES

Maps to “CC”.

BCC

Maps to “BCC”.

SENT_DATE, TIME_SENT

All known (non-custom) DATE, TIME pairs are combined into a single field value when ingested into Brainspace. You only need to include the date field in the profile; the time will be appended automatically.

ATTACHMENT_LIST

Maps to “Attachment”.

PARENT_ITEMID

Maps to “Parent ID”. It is also called “Parent_ID” in some Reveal databases.

CONVERSATION_ID

Maps to “Conversation Index”.

CUSTODIAN_NAME

Maps to “Custodian”.

ITEMID or BEGDOC

Maps to the ID field. You are only required to map the field used as the lookup key in Reveal Review, but best practice is to include both. The ID field is the only required Brainspace field, but ingest will fail if you are trying to ingest body text and none of the provided documents have text.

BODY_TEXT

This isn’t a real field in Reveal Review and won’t be in the Reveal Review profile; the plug-in adds it to pull document text from either the ISYS or Elastic index. Early case assessment pre-index in Reveal Review is not available at this time.

Running Ingest/Build/Overlay

Once you close the field mapping, the dataset settings are displayed and the dataset will be in the “Prepared” state. Before build/ingest, you can decide whether you want to overlay fields back into Reveal Review after the build completes.

Click the Build button to start ingest.

image18.jpg

Note

The button to the right of the “Prepared” text allows you to come back and re-configure the sources for a dataset once it is created. The trashcan on the lower left allows you to remove a dataset. When you delete a running dataset, it will first go to a “Stopped” state and will not disappear from the list right away; you have to refresh the screen after a few seconds.

Build displays the following popup.

image19.jpg

Choose how quickly you want the build to run.

image20.jpg

Once the build starts you will see progress on the far right hand side of the screen.

Progress is reported every 10K documents by default. To increase the report frequency, add the parameter ingest.batch.size to the Brainspace configuration file “brainspace.defaults.properties” located here: /var/lib/brains/.brainspace/

Example: ingest.batch.size=1000
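Putting the path and the parameter together, the added line in the configuration file would look like this (the value of 1000 is only an example; choose a frequency that suits your case size):

# /var/lib/brains/.brainspace/brainspace.defaults.properties
ingest.batch.size=1000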

image21.jpg

With ingest and build selected, the first stage pulls documents from Reveal Review and stores them in Brainspace.

Once ingest is complete, the build stage starts automatically.

image22.jpg

When the build completes, the area to the right will close and the dataset will be running and ready to go. If overlay was selected, you will see a message in the dataset area under admin that overlay is in progress. If it fails for some reason, an error message will be displayed.

Incremental Ingest

You can do an incremental ingest to add more docs to the same dataset. Re-open the dataset and click the button immediately to the right of the “Prepared” text to add more docs.

Notebook Syncing

This feature allows for pulling and pushing document ids between Brainspace and Reveal Review.

For pulling into Brainspace, you can select documents from a top-level work folder in Reveal Review or select documents with a certain tag. The work folders and tags available in the plug-in are controlled by the normal security methods in Reveal Review.

We aren’t going to do any actual tagging in Brainspace; we are just putting the documents into a notebook. In Brainspace 6.0, notebooks can have tags associated with them and you can use these to tag docs but there is no way in the plug-in to get the individual tags for a document back to Reveal Review.

Pushing will either add notebook documents to a work folder in Reveal Review or tag the documents in Reveal Review.

To setup sync you need to first create a notebook in Brainspace.

There are numerous ways to do this in Brainspace. Look for the icon image23.jpg.

A simple way to create a notebook is to click on “Notebooks” off the main screen and click the plus sign.

image25.png

If you select a set of documents, you can also select Add Notebook from the results button bar.

image24.png

Be sure to make your Notebook “Public” so others can find it. Add any related items to the Notebook. Select “Create Notebook”.

Once you create the notebook open it and you will see a Synchronize button on the right-hand side.

image26.jpg

First select the field type to synchronize, either work folder or tag.

image27.png

Then select the desired tag or work folder and click Next.

image28.png

You can select to either push or pull from Reveal Review.

For pushing you may see a message about concept metadata. For example, if you added a certain date range of documents to a notebook, the metadata may be present. There is no way to turn this message off in the plug-in. You can select “No” for this option.

image29.png

We do not currently send the additional metadata back to Reveal Review.

Syncing a small number of documents (1-200 or so) is almost instantaneous and the UI in Brainspace doesn’t always update. Sometimes it remains in “Synchronizing” mode. Continue to Refresh until synchronization is complete.

If you pushed documents, go to the work folder or tag in Reveal Review to verify. Select Refresh in the tree on the left to update the count. For tags, it is best to go to the admin node to verify. The documents will be tagged using the user id from the connector credentials.

Note

You cannot push tags for documents in Reveal that are locked for tagging. If you attempt to tag these in a sync push, they will not be tagged and the count of documents in Reveal Review will not match the notebook.

For pushing to Reveal Review there is an option in the connector configuration (shown earlier in this document) to remove work folder docs or un-tag documents that have been removed from the notebook.

Pulling into Brainspace is additive which means documents removed from Reveal Review are not removed from the notebook. To remove them, you would have to remove all docs from the notebook before re-syncing. Here again, the Brainspace UI does not always update correctly (as shown below) and you have to hit Refresh.

image30.jpg

Once you have set up sync for a notebook, it becomes “connected” and all you have to do to re-sync is click the Synchronize button again.

image31.jpg

In this case you will see a different popup.

image32.jpg

If you wish to sync to a different folder or tag you can disconnect and re-setup the sync.

Connected Tag Setup in Brainspace

Connected tags are another method to sync data from Reveal Review to Brainspace. You can connect a Reveal Review tag or work folder to a special type of tag in Brainspace. The main difference between normal notebook syncing and connected tags is connected tags are a one-way sync from Reveal Review to Brainspace (i.e. Reveal Review is the system of record).

image33.png

To set up a connected tag in Brainspace, go to the admin area for datasets and click the tag button on a dataset.

In the popup, click Connect Tags.

You will be presented with another popup containing Reveal Review tags and work folders. Reveal Review supports three types of tag sets: multi-select, mutually exclusive, and tree. Brainspace connected tags only allow for a single-choice tag with multiple options. This is the same as a mutually exclusive tag set in Reveal Review.

image34.jpg

For a multi-select tag set, multiple connected tags are created in Brainspace each with a single tag as an option. For example:

image35.png

Similarly for a tree type tag set, multiple connected tags are created in Brainspace with a single tag as an option. The tree levels are flattened out.

image36.jpg

Work folders appear at the bottom of the list with the prefix “WF:”.

image37.jpg

Check the tags and work folders you wish to connect and then press the Connect button. The popup will show the selected tags/work folders.

image38.jpg

If you hover in the rightmost column of the grid, you can pull the document ids with the tags or work folders from Reveal Review into Brainspace. You can also choose to disconnect the tag.

image40.jpg

Once you pull from Reveal Review for each one you can see the document counts.

image39.jpg
Using Connected Tags Within Brainspace

Connected tags can be used in multiple places in Brainspace. One example of where you might use them is when performing an advanced search:

image41.jpg

They are also displayed in the document view: when you click on a document, the connected tags appear on the left-hand side.

image42.jpg
Connected Tag Auto Syncing from Reveal Review

Once connected tag setup is complete in Brainspace, the tags can be set to sync automatically from Reveal Review to Brainspace. Go to Project Admin->Settings->Brainspace and set up the following fields.

image43.jpg

There are two types of syncing, time-based and on-demand. On-demand syncing occurs automatically when a user switches to the Brainspace tab that is embedded within Reveal Review.

There is also a force sync button under the admin area.

image44.png
Predictive Coding – TAR 1.0 Workflow – Initialization

Most of the Predictive coding configuration in Reveal Review is done automatically when you create a new classification in Brainspace. With a dataset open in Brainspace, click the Classification button located in the upper right-hand corner.

image45.png

Once the page loads, click the New Classification button.

Enter a name for the classification and select Predictive Coding as the workflow type. Click the Next: Validation Settings button.

The settings on the next popup are beyond the scope of this document. In order to test the workflow you can leave the defaults as is with the exception of the number of control set documents to review toward the bottom. Lower this to a manageable number and click Start Session.

image46.png

There will be a small delay while the setup is done in Reveal Review after which you will see the following:

image47.jpg

The popup polls and will show progress updates as documents are reviewed in Reveal Review. Once 100 percent are complete, the Retrieve Control Set button will be enabled.

image48.png
Predictive Coding – TAR 1.0 Workflow – Reviewing Documents

The following items are created automatically in Reveal Review when predictive coding is initialized within Brainspace.

Predictive Coding Team

A new team that is added to the tag profile and work folders mentioned below. This team does not initially contain users. The admin must add users when the TAR session starts, or use a different existing team and add that team to the work folder and tag profile manually.

BDPC Needs Review Field

All documents that need reviewing for the current round. (This just means the documents need to be reviewed in the current round, not whether or not they have actually been reviewed.) Reviewers do not typically need access to this field. This field is automatically deleted once the TAR session is closed in Brainspace.

BDPC Predictive Rank Field

Score assigned to documents after each round and when the TAR session is closed. Can be used to validate round results and final coding. Needs to be added manually by an admin to a field profile but is probably not required by normal reviewers.

BDPC Use For Training Field

Whether a document will be used for training in Brainspace. Set automatically to false for the control set round and true for training rounds. This field is not required by normal reviewers. There is an admin function available in Brainspace to pull documents marked as “Use for Training” from Reveal Review. This would be used in an admin scenario where hot docs or highly relevant or non-relevant docs can be selected in Reveal Review and used in the next training round in Brainspace to help Brainspace better code the relevancy of documents. There is a “Manual” option when creating a training round that will pull these documents from Reveal Review.

BDPC Auto Code Field

How each document was coded by Brainspace at the end of the TAR session. A value of 0 means Non-Responsive, 1 means Responsive, and NULL means not coded. Used by the admin to produce the results or assign them out for further review if required.

Predictive Coding Tag Profile

Tag profile with one tag pane and the created tag set mentioned below. The predictive coding team is automatically added.

BDPC Is Responsive Tag Set

Tag Set with two tags, one for responsive and one for non-responsive. Automatically added to tag profile and tags are set to automatically trigger reviewed status.

Predictive Coding Work Folder

Created under the Brainspace root folder set up under admin in Reveal Review. The predictive coding team is automatically added.

Needing Review Work Folder

Created as a subfolder under the predictive coding root; the predictive coding team is automatically added. This is the normal location where an admin would go after creating the control set or a training round to find the documents that Brainspace wants reviewed. The documents can be reviewed directly from this folder or assigned by the admin.

Reviewed Batches Work Folder

Created as a subfolder under the predictive coding root. Once documents are reviewed for a round and the results are retrieved in Brainspace, the documents are moved to a new batch folder under this folder. Its purpose is just to memorialize the documents from each round.

Final Coding Work Folder

Created as a subfolder under the predictive coding root and used when a TAR session is closed in Brainspace. Two subfolders are created under this root with the final responsive and non-responsive coding results that were assigned by Brainspace.

Once you have initialized the predictive coding session from within Brainspace, locate the “Needing Review” folder underneath the predictive coding root. It will contain the documents from your control set.

image49.jpg

The documents can be reviewed directly from this folder or assigned under the normal assignment area using the work folder in a search or the “needs review” field from IMPORT_DOCUMENTS.

If assigned, the predictive coding tag profile should be used. If not assigned, set the predictive coding tag profile before opening a document so it is selected by default in the document.

image50.png

The predictive coding tag profile will look similar to the following.

image51.jpg

Once all documents have been reviewed, the progress should be at 100 percent within Brainspace. Press the Retrieve Control Set button.

image52.jpg

The popup will close and you will be returned to the main screen for the session. You can review the control set results and then begin adding training rounds.

Note

  1. If you happened to mark all documents in the control set as responsive or non-responsive, then Brainspace will double the size of the control set and ask you to continue reviewing those documents. Go back to Reveal Review, press the Refresh button in the left pane to refresh the work folders, and then click on the “Needing Review” node to load the new documents.

  2. If you are unsatisfied with the results of the control set, you can convert that set to a training round. To do so, press the circular arrow at the bottom of the control set. This will convert the set and then automatically create a new control set for review.

Predictive Coding – TAR 1.0 Workflow – Training Rounds

There are two options for selecting documents when creating a training round: Automatic and Manual. In most cases you will use Automatic to allow Brainspace to select the documents. If an admin has marked docs for training in Reveal Review you can use the Manual button to pull those in.

image53.jpg

For the first training round, only Random and Influential are available as the Automatic selection method. Other methods are available for subsequent rounds.

Training rounds display the progress directly in the main screen.

image54.jpg

Note

It is not uncommon for documents from a previous training round, or even from the control set, to be included in a new training round. This means you may see a percent complete immediately after creating the round, without reviewing any documents from that round. Brainspace has verified that this is normal behavior and the documents do not need to be re-reviewed during the round.

Once the set has been created, return to the “Needing Review” folder in Reveal Review and review the documents in the same manner as the control set. If you are still logged into Reveal Review and sitting on the main form, you will need to press the Refresh button in the left pane and then click on the “Needing Review” work folder node to load the new documents on the right-hand side.

Predictive Coding – TAR 1.0 Workflow – Closing the Session

Once you have created enough training rounds, the precision, recall, and F-score have stabilized, and you are satisfied with the results, you can click the Close Session button in the top right.

image55.jpg

It can take a moment to store the results in Reveal Review. There is a small dark spinner that is displayed while this is taking place.

After you have closed a session, you can re-open it and continue creating training rounds. You will have to close and re-open to update the results in Reveal Review.

image57.jpg

The final coding work folders will contain the responsive and non-responsive documents as coded by Brainspace. The reviewed batch folders shown below are created after each round and store the documents reviewed during that round. The first batch is the control set round.

image56.jpg
Predictive Coding – TAR 1.0 Workflow – Viewing final ranking

There are five fields of interest that you can add to a field profile so they can be displayed in the document grid.

Brainspace-Reveal_Connector_fields_of_interest.png
  • “BDPC Needs Review” and “BDPC Use For Training” are set while reviewing documents during the control set and training rounds and would mainly be useful for an administrator.

  • “BDPC Predictive Rank” is updated for each document after each round and the final result is stored when you close the session.

  • “BDPC Is Responsive” is the Reveal Review tag set and displays how the reviewer coded in Reveal Review.

  • The “BDPC Auto Code” field is the final responsive/non-responsive coding result that Brainspace assigned. A value of true means responsive, false means non-responsive, and blank means not coded. You can view these two fields side by side to check for inconsistencies.

You can also create and run a BDPC coding consistency search. This is a search of all training documents whose predictive coding score is not consistent with the manual coding done by the reviewer: documents tagged by a reviewer as responsive but with a score less than 0.5, or tagged as non-responsive by the reviewer with a score greater than 0.5.
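As a rough illustration of that logic only (this is not Reveal’s search syntax; the field and tag set names come from the list above, and the exact tag label strings are assumptions), a consistency check might look like this in Python:

# Minimal sketch of the coding-consistency logic described above.
# Assumes each reviewed training document has been exported with its
# "BDPC Predictive Rank" score and the reviewer's "BDPC Is Responsive" tag;
# the tag label strings used here are assumptions.

def is_inconsistent(predictive_rank: float, reviewer_tag: str) -> bool:
    """Flag documents whose reviewer coding disagrees with the Brainspace score."""
    if reviewer_tag == "Responsive" and predictive_rank < 0.5:
        return True
    if reviewer_tag == "Non-Responsive" and predictive_rank > 0.5:
        return True
    return False

# Example: a document the reviewer tagged Responsive but Brainspace scored 0.12.
print(is_inconsistent(0.12, "Responsive"))  # True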

Pulling Brainspace Cluster information into Reveal Review

Brainspace cluster information can be pulled into and displayed in Reveal Review instead of normal Reveal clustering. Go to Reveal Project Admin->Settings->Concept Clustering and switch from Reveal document clustering to Brainspace clustering.

image63.png

The indexing batch service is used to pull the Brainspace cluster log files using the Brainspace REST API. These files are stored under the location set by system setting BrainspaceReportFolder. It then processes and stores the clusters in the normal tables used for Reveal clusters.

Check the service log files to determine when the process is complete.

For Brainspace clusters, the cluster type in the CLUSTER_SETTINGS table is 2 instead of 1 and in theory you could switch back to Reveal clusters without having to re-build them in the Ops tool.

When using the Brainspace cluster wheel, you will notice that the main parent clusters are located in the inner circle and child rings are added as you continue out. The counts displayed for a parent will include any child counts. The counts and documents displayed in the grid in Reveal Review mimic this same behavior.

image64.png

Note

The length of the displayed cluster name in the tree is controlled by system setting BrainspaceClusterLabelLength.

image65.jpg

When you hover over a node, a tooltip appears that displays all of the terms for the node plus the count and the cluster type. Normal clusters do not display a type; excluded, near-duplicate, and exact-duplicate clusters do.

image66.jpg
CMML Workflow

In order to configure and use Brainspace’s CMML workflow, you must first connect tags from a Reveal case to a Brainspace Dataset. In the Administration menu, find the Dataset you wish to connect and click the ‘tag’ button. This will bring up the manage tag dialog box.

image67.jpg

Select Connect Tags and the list of all tag sets from the case connected to this Dataset will be displayed.

image68.jpg

Select the tag set that contains the tags that will map to the Responsive/Non-Responsive categories in Brainspace. At this point the tags will be connected, but the values of the tags will not be available in Brainspace until the tag data is pulled.

image69.jpg

At the tag set name, select the Pull Tag option to pull the tag information from Reveal into Brainspace.

image70.png

The CMML workflow is configured within the Supervised Learning module.

image71.png

Create a new classifier by selecting the New Classifier button. Give the Classifier a descriptive name and then select the Type. In this scenario, we are using CMML.

image72.jpg

Click the Settings button to continue with the configuration. This is where the Reveal tags will be associated with the Brainspace Positive/Negative options. Once selected, and after a short wait, any counts associated with the linked tags will also be displayed on the screen. These values should match what is in Reveal and what was seen when the tags were pulled into Brainspace.

image73.jpg

Once the CMML configuration is saved, the initial statistics will be displayed for the Classifier. If the ultimate goal is to use the tag information from Reveal to score the rest of the document universe in Brainspace and export that score to Reveal, the score will be exported to the field named in the upper left-hand corner next to the Score Field label.

image74.jpg

To train on the documents previously tagged in Reveal and synced with Brainspace, and score the rest of the document set, hit the Train Now button. This will run the training and export the scores to Reveal. If this is the first time training has been run, the custom field to receive the Brainspace score will be created automatically. The field is called BDCMML Score.

image75.jpg

With the scores generated in Brainspace and exported to Reveal, the user must add the newly created field to a field profile in order to see the values. In Reveal, go to the Fields tab in the Project Admin area. Select the field profile to add the newly created score field to and select Assign Fields. Adjust the field ordering, if desired.

image76.jpg

The field will now be a part of the results grid and the values will range from 0.00 to 1.00 with 1.00 being most likely responsive.
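If you later want to work with the exported scores outside the grid (for example, to bucket documents for prioritized review), a minimal sketch might look like the following; the document values and cutoffs are purely illustrative, not a Reveal or Brainspace recommendation:

# Minimal sketch: bucket documents by the exported BDCMML Score (0.00-1.00,
# where 1.00 is most likely responsive). Values and cutoffs are illustrative only.

documents = [
    {"begdoc": "DOC000001", "bdcmml_score": 0.92},
    {"begdoc": "DOC000002", "bdcmml_score": 0.41},
    {"begdoc": "DOC000003", "bdcmml_score": 0.08},
]

def review_priority(score: float) -> str:
    if score >= 0.70:
        return "review first"
    if score >= 0.30:
        return "review later"
    return "likely non-responsive"

for doc in sorted(documents, key=lambda d: d["bdcmml_score"], reverse=True):
    print(doc["begdoc"], review_priority(doc["bdcmml_score"]))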