Annotate corpus

The CLAMP-Cancer annotation module lets you annotate custom entities and specify relations between them in your corpus. These annotations attach additional clinical information to selected text, producing an annotated corpus better suited to your specific task. Task-specific models can then be developed and used in the machine-learning modules of CLAMP-Cancer or any other system of your choice. Before using this function, you need to:

  1. Create a project
  2. Import the files that you want to annotate

After completing these steps, you can annotate the imported files according to a predefined structure. The following instructions walk through each of these steps.

  1. Create a new project:
    1. Click the plus (+) sign at the top left corner of the screen, as shown in the figure below.
      Step 1 to create a new project
    2. On the pop-up window, enter a name for your project, e.g., Drug_name_annotation.
       Creating a new Corpus Annotation project
    3. Select Corpus Annotation as the project type.
    4. Click the Finish button.

A new project with the name that you have specified is created and placed in the Corpus panel.

Creating a new Corpus Annotation project

Double-click the project name to view its contents. The created project contains two main folders:
Corpus: Contains the files that will be annotated.
Models: Contains the machine learning models generated from the annotated files. In addition, this folder includes the gold standard annotations and the prediction results generated by the n-fold cross-validation process.
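
If you want to confirm the two-folder layout outside the application, a short script can check for it. The sketch below is illustrative only: it assumes the project is stored on disk as a directory whose subfolders are named Corpus and Models, which this guide does not confirm, and the function name missing_project_folders is hypothetical rather than part of CLAMP-Cancer.

```python
from pathlib import Path

# Folder names come from the description above. That they appear
# under these exact names on disk is an assumption.
EXPECTED_FOLDERS = ("Corpus", "Models")


def missing_project_folders(project_dir):
    """Return the expected folder names that are absent from project_dir."""
    root = Path(project_dir)
    return [name for name in EXPECTED_FOLDERS if not (root / name).is_dir()]
```

Calling missing_project_folders with the path to a project directory returns an empty list when both folders are present, and otherwise lists the ones that are missing.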