GRRD User Manual


Microsoft Word Add-In for the GenePattern Reproducible Research Document July 2009

Introduction
  About GenePattern
  How GenePattern and the GRRD Add-In Work Together
  Reproducibility of Document Interactions
Installing and Uninstalling the GRRD Add-In
  Requirements
  Installing the GRRD Add-In
  Uninstalling the GRRD Add-In
  Temporarily Disable the Application
GRRD Add-In Overview
  GenePattern Tab
  Wizard versus Form
  GRRD Add-In Icons
Connecting to a GenePattern Server
  Creating a New Connection
  Managing Connections
Inserting Pipelines into a Document
  Working with Images that have Associated Pipelines
Looking at Pipelines in a Document
Rerunning a Pipeline
Viewing and Saving Pipeline Results
  Viewing Pipeline Results
  Rerunning a Pipeline Generates Temporary Results
  Saving Pipeline Results
  Using Visualizers to Display Results
Exporting Pipelines from a Document
  Exporting a Pipeline Using the GRRD Add-In
  Exporting a Pipeline Using GenePatternDocumentExtractor
Controlling Document Size
Troubleshooting
  GenePattern Tab Is Not Visible
  F1 Displays Help for Word


Introduction

Through its analytic workflows (pipelines), the GenePattern computational biology software environment provides a way to create and distribute an entire computational analysis methodology in a single executable script, enabling a form of in silico reproducible research. The GenePattern Reproducible Research Document (GRRD) Add-In allows authors working in Word 2007 to embed GenePattern pipelines in their documents. The readers of such documents can then rerun the pipelines on any GenePattern server without leaving the Word environment. The GRRD Add-In provides an excellent way to publish final results and research in a form that allows easy replication of computational analyses, and it can also be used to organize in silico experiments.

This manual describes the use of the GRRD Add-In and assumes that you are familiar with GenePattern. If you are new to GenePattern, a brief description follows. For a more in-depth description of the application, see the GenePattern Concepts Guide.

About GenePattern

GenePattern [1] (http://genepattern.org) combines a powerful scientific workflow platform with more than 100 genomic analysis tools or “modules.” The module repository, which is constantly updated, includes tools for gene expression analysis; proteomics data analysis; copy number, LOH and SNP analysis; modeling of flow cytometric data; network/pathway analysis; data preprocessing and data conversion. Visualization modules display data graphically and allow interactive manipulation of data. Users can create pipelines that combine analysis and visualization modules into a single, reusable workflow.

The GenePattern software is freely available for download at http://genepattern.org. In addition, researchers around the world have free access to a public GenePattern server hosted at the Broad Institute (the GenePattern public server) through a simple web interface (see the GenePattern User Guide).

GenePattern has a client-server architecture. The server is the back-end analytic engine and the client is the user interface. You use a GenePattern client (such as the GenePattern web interface) to connect modules to form a reusable pipeline and run the pipeline on the server. When the pipeline job completes, the pipeline and its input and result files are stored as a “job” on the GenePattern server. A pipeline created on one GenePattern server can be installed and run on any other GenePattern server.

Biomedical investigators and clinicians use GenePattern pipelines during all phases of a research project:

Active research – Capturing and rerunning complex computational methods as pipelines, using pipelines to run analysis methods against alternative datasets, or modifying pipelines to experiment with variations of a method. GenePattern records each change to a pipeline, ensuring accurate reproduction of analysis results.

[1] Reich, M., Liefeld, T., Gould, J., Lerner, J., Tamayo, P., Mesirov, J.P. GenePattern 2.0. Nature Genetics 38, no. 5 (2006): 500-501. doi:10.1038/ng0506-500.

Publishing results – Packaging analysis methods, data sets, and research results as pipelines for review, validation, distribution, and publication.

Training – Using pipelines to introduce novice users to complex analysis methods.

How GenePattern and the GRRD Add-In Work Together

The GRRD Add-In can be thought of as a GenePattern client integrated into Word. While the Add-In doesn’t provide the full functionality of a GenePattern client, it allows you to connect to a GenePattern server, to insert pipelines into a document, and to rerun the inserted pipelines on a GenePattern server. Each pipeline in a Word document is associated with an image or with text via a GenePattern link icon. A person reading the document can click on such an image or icon to view the associated pipelines and their input files, and rerun the pipelines on any GenePattern server while remaining in the Word application.

Reproducibility of Document Interactions

To maintain a history of your interactions and reruns within the document, insert the pipeline results. This creates a new section, or appendix, at the end of the document, which includes the pipeline name, your comments, the parameter values used to rerun the pipeline, and the generated result files. The inserted text contains all of the information needed to reproduce the pipeline results.
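As a rough illustration of what such an appendix entry holds, the sketch below models the four items listed above (pipeline name, comments, parameter values, result files) as a small Python record and renders it as plain text. The class, function, and field names are hypothetical and chosen only for this example; they do not describe how the GRRD Add-In actually stores or formats the inserted results.

```python
from dataclasses import dataclass, field


@dataclass
class RerunRecord:
    """Hypothetical model of one saved rerun; not the Add-In's real format."""
    pipeline_name: str
    comments: str
    parameters: dict = field(default_factory=dict)
    result_files: list = field(default_factory=list)


def render_appendix_entry(record: RerunRecord) -> str:
    """Render the record as plain text containing the four listed items."""
    lines = [
        f"Pipeline: {record.pipeline_name}",
        f"Comments: {record.comments}",
        "Parameters:",
    ]
    # Sort parameter names so the rendered entry is stable between runs.
    for name, value in sorted(record.parameters.items()):
        lines.append(f"  {name} = {value}")
    lines.append("Result files:")
    for path in record.result_files:
        lines.append(f"  {path}")
    return "\n".join(lines)
```

Keeping the parameter values next to the result files is what makes an entry like this sufficient to repeat the run later.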


Installing and Uninstalling the GRRD Add-In

Requirements

The GRRD Add-In is currently available for Word 2007 on the Windows operating system. It has been tested on platforms that run the native Windows operating system and on platforms that run Windows using the VMware virtualization software. Documents edited using the Add-In should be saved in the Word document (.docx) format.

Note: Users who do not have Word 2007 can use the GenePatternDocumentExtractor module to extract pipelines and data from a Word document. The extracted pipelines can then be installed and run on any GenePattern server. See Exporting Pipelines from a Document for more information.

Installing the GRRD Add-In

Note: You must be logged on as an administrator to install software on your computer.

To install the GRRD Add-In:
1. Exit Word. It is assumed that Word 2007 is already installed.
2. Download and install the prerequisite Office-related software from http://www.microsoft.com/DOWNLOADS/details.aspx?FamilyID=54eb3a5a-0e52-40f9-a2d1-eecd7a092dcb.
3. Download and install the GRRD Add-In installation package from http://genepatternwordaddin.codeplex.com.
4. Follow the provided installation Wizard.

To verify that the installation was successful, open Word and look for the GenePattern tab. The GenePattern tab appears to the right of the default Office tabs.

If the installation was not successful, see Troubleshooting for more information.

Uninstalling the GRRD Add-In

To uninstall the GRRD Add-In:
1. Exit Word.
2. Go to Start -> Control Panel -> Add or Remove Programs.
3. Select the “GenePattern Reproducible Research Document Add-In”.
4. Click the Remove button. The GRRD Add-In will be uninstalled.

Temporarily Disable the Application

To temporarily hide the GenePattern functionality without uninstalling the application:
1. In Word 2007, click the Office Button and then the “Word Options” button. Word displays the Word Options window.
2. In the Word Options window, click “Add-Ins” in the left column and then the “Go” button after “Manage: COM Add-ins” at the bottom of the screen. Word displays the COM Add-Ins window.
3. In the COM Add-Ins window, clear the box next to “GenePattern Reproducible Research Document Add-In” and click OK.
4. To re-enable the GenePattern tab, repeat steps 1-3, but check the box in step 3 instead of clearing it.


GRRD Add-In Overview

GenePattern Tab

Installing the GRRD Add-In adds a GenePattern tab to Word 2007. The tab provides the following tools:

• Dashboard: Displays the Dashboard pane, which shows each image or GenePattern link icon that is associated with a GenePattern pipeline and lists all GenePattern content that is embedded in the document. See Looking at Pipelines in a Document for more information.
• View Associations: Displays the Associated Pipelines pane, which shows detailed information about the GenePattern pipelines associated with the currently selected image or GenePattern link icon. See Looking at Pipelines in a Document for more information.
• Displays a dialog that allows you to password protect this document. Users who do not know the password cannot modify the document text or pipeline associations.
• Associate Pipeline: Associates a GenePattern pipeline with the currently selected image. See Inserting Pipelines into a Document for more information.
• Insert GenePattern Link: Associates a GenePattern pipeline with text via a GenePattern link icon. See Inserting Pipelines into a Document for more information.
• Rerun Pipeline: Reruns a GenePattern pipeline associated with the currently selected image or GenePattern link icon. See Rerunning a Pipeline for more information.
• Removes all GenePattern content from the document. This includes all pipelines, modules, input files, parameter settings, and saved pipeline results.
• Use Wizard UI: The Wizard user interface (UI) helps you insert or rerun pipelines. Clear this checkbox to replace the Wizard UI with a forms-based UI. See Wizard versus Form for more information.
• Connections: Displays the Connections pane, which shows all connections between Word and GenePattern servers. Use the Connections pane to add, edit, and remove server connections. See Connecting to a GenePattern Server for more information.
• Current Connection: Selects a previously configured server connection. See Connecting to a GenePattern Server for more information.
• Create Connection: Configures a new server connection. See Connecting to a GenePattern Server for more information.

Wizard versus Form

When you insert or rerun a pipeline, you can use either a Wizard or a form:

• Select the Use Wizard UI checkbox on the GenePattern tab to have the Wizard walk you through the operation step-by-step. You must complete the Wizard before continuing to work in Word.
• Clear the Use Wizard UI checkbox on the GenePattern tab to display all options for the operation in a single form. Using the form allows you to switch between Word documents while inserting or rerunning pipelines.

Wizards and forms offer the same options and perform the same operations; which you use is a matter of personal preference. This guide uses the Wizard UI to describe these options and operations.

GRRD Add-In Icons

File browsers often use icons to identify common file types. In the same way, the GRRD Add-In uses icons to identify the following GenePattern elements listed in Wizards and forms:

• Pipeline job on a GenePattern server.
• Pipeline embedded in the Word document. For more information, see Controlling Document Size.
• Pipeline referenced by URL in the Word document. For more information, see Controlling Document Size.
• GenePattern analysis module.
• GenePattern visualization module.
• Pipeline input parameter.
• Pipeline result file.
• Pipeline result file highlighted by the author as being of primary interest to the reader.


Connecting to a GenePattern Server

You must have a connection to a GenePattern server to rerun a GenePattern pipeline or to associate a GenePattern pipeline with a document. You may use the public GenePattern server hosted at the Broad Institute, a server running on your local machine, or a shared server in your lab or department. If you work with multiple GenePattern servers, you can add connections to all of the servers and switch between them as needed. GenePattern server connections are saved when you exit from Word and restored when you restart Word.

The current connection is shown in the “Current Connection” dropdown on the GenePattern tab. This is the server used for viewing and rerunning pipelines. You can change the current connection at any time by selecting a different one from the dropdown menu or the Connections pane.

Creating a New Connection

To create a connection to a GenePattern server:
1. Click the Create Connection option on the GenePattern tab. The GRRD Add-In displays the GenePattern Server Connection form.
2. Fill in the form.
   • To connect to the GenePattern public server, click the “Connection To GenePattern Public Server” button. After the GRRD Add-In refreshes the Server Connection form, enter the username and password that you use to log into the public server. (If you do not have an account on the public server, you can register for a free account at http://genepattern.broadinstitute.org/gp/.)
   • To connect to another server, enter the server name (for example: myserver.domain.com), the port (for example: 8080), and, if required, a username and password.
   You may want to give the connection a more descriptive name than “Connection 2” (for example: “Local GenePattern Server”), especially if you are going to switch between multiple servers.
3. Click the “Test Connection” button to check that you have entered the correct server name, port, and credentials. GenePattern servers may use port numbers other than the default of 8080. If the connection succeeds, the GRRD Add-In displays a “Test Successful” message. If the connection fails, check that the server is running (try logging into the GenePattern server outside of Word) and that you entered the correct port, username, and password. If the connection still fails, contact the GenePattern server administrator.

4. Click OK on the Server Connection form to add the new connection.
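If the connection test fails, it can help to check from outside Word whether the server name and port are reachable at all. The following Python sketch is one way to do that under stated assumptions: the function names are hypothetical, the "/gp/" path is assumed from the public server URL given above, and the check covers only network reachability. It does not verify usernames or passwords and is not what the Add-In's “Test Connection” button does internally.

```python
import socket


def build_server_url(host: str, port: int = 8080, path: str = "/gp/") -> str:
    """Compose a server URL from the form fields (the /gp/ path is an assumption)."""
    return f"http://{host}:{port}{path}"


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved host names.
        return False
```

For example, `port_is_open("myserver.domain.com", 8080)` returning False suggests a wrong server name or port, or a server that is not running, rather than a credentials problem.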


Managing Connections

To manage connections, click the Connections tool on the GenePattern tab to open the Connections pane.

The top panel of the Connections pane shows all of the available pipelines on the currently selected GenePattern server. The items in this list are restricted based on your user account; you will see all public pipelines and your own private pipelines.

The bottom panel of the Connections pane shows all of your configured connections. You can add, edit or delete connections using the provided buttons. To switch to a different GenePattern server, select a connection and click the “Set as current connection” button. This is the same as selecting a server from the Current Connection dropdown on the GenePattern tab, as described in Connecting to a GenePattern Server.

If you attempt to access a GenePattern server that is no longer available, the GRRD Add-In displays a “connection failed” message. To correct the error, check that the GenePattern server is running and try to access it again, or switch to a different server.

Inserting Pipelines into a Document

Once you have established a connection to the GenePattern server, you can insert pipelines from that server into your document. If you are the author, this helps you keep track of which pipeline jobs on the server produced your final results. It also allows anyone to whom you distribute the document to reproduce your research.

Note: You must have run the pipeline on the server before you can insert the pipeline into the document. The GRRD Add-In uses the pipeline job to set default values for pipeline parameters and input files.

A pipeline can be associated with an image or with text via a GenePattern link icon. For example, if you have an image in a paper that summarizes the results from one or more pipelines, you can associate that image with the pipeline(s) that generated the results represented in the image. Alternatively, if you have text or a table that describes the results from one or more pipelines, you can insert a small GenePattern link icon at that point and associate the pipeline with the icon. For example:

…The method builds predictive models using marker genes that are significantly differentially expressed between two subtypes of leukemia, acute lymphoblastic (ALL) and acute myelogenous (AML).

To insert a pipeline into a document:
1. Locate the image or text with which you wish to associate the pipeline. To associate a pipeline with text you will need to insert a GenePattern link icon.

   Image
   a. Select the image.
   b. Click the Associate Pipeline tool on the GenePattern tab.
   The GRRD Add-In starts the Associate Pipeline Wizard.

   GenePattern link icon
   a. Position the cursor where you want to insert the link icon.
   b. Click the Insert GenePattern Link tool on the GenePattern tab.
   The GRRD Add-In starts the Associate Pipeline Wizard. It will insert the link icon into the text when it inserts the pipeline into the document.

2. Use the Associate Pipeline Wizard to associate a pipeline with the image or link icon. This section describes the Wizard. If you prefer a forms-based UI, clear the “Use Wizard UI” checkbox on the GenePattern tab.

Begin

This screen indicates the task to be performed. If you do not want to see this step the next time you run the Wizard, clear the “Show this screen next time I start the wizard” option. Click Next.

Select Pipeline

Use the Select Pipeline screen to select the pipeline to associate with the image or link icon. You can select a pipeline already associated with this document or a pipeline job from the server. (Modules are inserted into a document only as part of a pipeline; therefore, module jobs on the server are not listed.)

Use the radio buttons above the pipelines list to display either the pipelines already associated with this document (“Use Job From Document”) or the pipelines on the GenePattern server (“Use Job From Server”). Note: If you have not yet added pipelines to the document, the screen will list the pipelines on the server; it will not have these two radio buttons.

Use the text box above the list to filter the pipelines that appear. For example, to see only pipelines that contain “Golub” in their name, type “Golub” in the text box and click the Refresh button.

Use the last two radio buttons on the screen to determine how the document stores the pipeline and its modules:

• Select “Store Pipeline in Document” to copy the pipeline into the document. This allows anyone reading the document to rerun (or export) the pipeline, but increases the size of the document.
• Select “Refer to Pipeline on GenePattern Server” to store in the document a pointer to this pipeline. This keeps the document smaller, but a reader must have access to the server on which the pipeline resides to rerun (or export) the pipeline.

For more information, see Controlling Document Size. After selecting the pipeline to associate with the image, click Next.

Parameters

The Parameters screen displays the pipeline parameters and their values. The parameter values are useful for confirming that you have selected the desired pipeline job. On this screen, the parameter values are for review only; however, a person rerunning the pipeline will be able to change the parameter values. Click Next.

Tip: Pipeline parameters allow the reader to rerun the pipeline with different values. For example, if you want to allow the reader to rerun the pipeline using a different data file, the pipeline must have an input file parameter that allows the reader to specify an alternate data file. In GenePattern, to add parameters to a pipeline use the “prompt when run” checkbox when creating or editing the pipeline, as described in the GenePattern User Guide.

Stop Point

Pipeline execution begins at the first step, but can be stopped at any point. Use this screen to set a stop point for the pipeline. For example, if the associated image is based on the 4th step of a 10-step pipeline, you might set the stop point at step 4. Setting the stop point at the last step runs the complete pipeline. Setting the stop point at an appropriate, earlier step reduces the pipeline run time for a reader who is reproducing your in silico experiment. After selecting the stop point, click Next.

Note: This screen sets the default stop point for the pipeline. The person rerunning the pipeline can accept this default or choose an alternate stop point.
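The effect of a stop point can be pictured as truncating the ordered list of pipeline steps. The short Python sketch below illustrates this idea only; the function name and the representation of steps as a list are assumptions made for the example, not how GenePattern executes pipelines.

```python
def steps_to_run(steps, stop_point):
    """Return the steps executed when the pipeline stops after `stop_point`.

    `stop_point` is 1-based, matching the step numbers shown in the Wizard.
    """
    if not 1 <= stop_point <= len(steps):
        raise ValueError("stop point must be between 1 and the number of steps")
    # Execution always begins at the first step, so slice from the start.
    return steps[:stop_point]
```

With a 10-step pipeline, a stop point of 4 yields the first four steps, and a stop point of 10 yields the complete pipeline.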

Primary Result

This screen lists the pipeline result files. Select the result file of primary interest to the reader. For example, you might select the result file that contains the results shown in the associated image. When a reader views the pipeline association, the primary result file is highlighted. Select a result file and click Next.

Annotation

In this screen, enter a name and description for the pipeline association. The name should be something simple that helps you remember this association. The description contains notes about the pipeline and/or the image that might be useful to the reader. For example, you might describe how the associated image was generated from the pipeline result file or the visualizer settings used to capture that image. After entering the association name and description, click Next.

Associations to Add

This screen lists the pipeline associations that you have chosen to add to the selected image or link icon. Typically, it shows one new pipeline association.

Occasionally, you may want to add additional pipeline associations to the selected image or link icon. For example, an image that summarizes the results from different pipelines or contains a figure composed of multiple images (e.g., Fig. 2a, 2b, 2c) might need multiple pipeline associations. Click the “Add Another” button to add another pipeline association. If you have defined multiple pipeline associations, the Delete button becomes available and can be used to remove one or more of the associations. When the screen lists the pipeline associations that you want to add, click the Finish button.

Finish

When you click Finish, the GRRD Add-In displays a progress indicator as it downloads the pipeline data to associate with the image. During this process, the progress indicator displays a Cancel button. When the download is complete, the Cancel button changes to a Close button. Click Close to close the progress indicator window.

The following options affect the length of time required for the download:

• Whether the pipeline association stores a copy of the pipeline or references the pipeline on a server. Downloading a copy of the pipeline takes significantly longer than downloading a reference. For more information, see Controlling Document Size.
• Whether input files for the pipeline are stored on the GenePattern server or referenced by URL. Downloading a file from the server takes significantly longer than downloading a URL reference. For more information, see Controlling Document Size.

To view the new pipeline associations, click the image or link icon and select the View Associations tool on the GenePattern tab (see Looking at Pipelines in a Document).

Working with Images that have Associated Pipelines

When an image or GenePattern link icon is copied to the clipboard, its pipeline associations are not copied with it. This prevents pipeline associations from being copied between documents. It also imposes the following caveats for working with images and link icons that have pipeline associations.

Moving Images

Cutting an image or GenePattern link icon deletes all of its pipeline associations, even if it is immediately pasted elsewhere in the document. To move an image or link icon without losing its pipeline associations, it is best to drag it.

Copying Images

When you copy-and-paste an image or GenePattern link icon that has a pipeline association, the original and copied images share a single pipeline association. Such shared pipeline associations are not recommended and may cause errors. Therefore, use copy-and-paste operations only to move an image or link icon. Delete the original image or icon to avoid the shared pipeline association.

Note: Having the same pipeline associated with multiple images or link icons is fine, provided that you add the pipeline associations as described in Inserting Pipelines into a Document. The undesirable side effects of “shared associations” occur only when you copy-and-paste pipeline associations.

Changing Images

Deleting an image or GenePattern link icon deletes its pipeline associations. If you have associated one or more pipelines with an image and find that you need to change the image, use the “Change Picture” tool in Word to replace the image.

Warning: The “Change Picture” tool (accessed from the Word right-click menu) does not change the corresponding thumbnail displayed in the Dashboard pane.

Undoing Edits on Images

Word uses the clipboard for undoing and redoing operations. Since pipeline associations are not copied to the clipboard, you cannot always undo or redo operations that include pipeline associations. For example, if you cut an image that has a pipeline association, undoing that operation restores the image, but not the pipeline association.

Looking at Pipelines in a Document

Each pipeline in a document is associated with an image or with text via a GenePattern link icon. All such pipeline associations are managed through two panes: the Dashboard pane and the Associated Pipelines pane.

• The Dashboard pane displays all images and link icons that have associated pipelines. Use this pane to find, select and scroll to pipeline associations.
• The Associated Pipelines pane displays all pipelines associated with a selected image or link icon. Use this pane to view, rerun, or export a pipeline.

To look at a pipeline in the document:
1. Open the Dashboard pane by clicking the Dashboard tool on the GenePattern tab. The top portion of the Dashboard pane shows a thumbnail of each image or link icon in the document that has associated pipelines. The names of the pipeline associations are displayed as image captions. The bottom portion of the pane lists all of the GenePattern content that is embedded in the document. The icons identify the type of content, as described in GRRD Add-In Icons.
2. Double-click an image, or click an image and then the “Scroll to Picture” button. The GRRD Add-In scrolls to that image in the document and selects it.
3. Open the Associated Pipelines pane by clicking the View Associations tool on the GenePattern tab. The top portion of the Associated Pipelines pane lists the pipelines associated with the currently selected image. To add an association to this list, use the instructions in Inserting Pipelines into a Document.
4. Click a pipeline to display its details. Table 1 shows the Associated Pipelines pane and describes each element.

Table 1. Associated Pipelines Pane

• The first panel lists the pipeline associations. Select an association to view its details.
• Documentation: displays any available documentation for the pipeline. You can choose to view the documentation or save it to disk.
• Edit: displays the Associate Pipeline Wizard, which is used to edit the pipeline association. See Inserting Pipelines into a Document for a description of the Wizard.
• Delete: deletes the pipeline association.
• Annotation: displays a description of the pipeline association.
• Pipeline steps: lists the modules run by the pipeline in the order that they are run.
• Export: exports the pipeline and its modules from the document to a zip file that can be imported into GenePattern.
• Run: reruns the pipeline. See Rerunning a Pipeline for more information.
• The last panel displays pipeline result files, image files, and parameter settings. See Viewing Pipeline Results for more information.

Rerunning a Pipeline To rerun a GenePattern pipeline associated with a document: 1. Select an image or link icon associated with the pipeline (see Looking at Pipelines in a Document). 2. Select the pipeline association in one of two ways: Rerun Pipeline tool

Run button in Associated Pipelines pane

1. Click the Rerun Pipeline tool on the GenePattern tab. The GRRD Add-In displays the Rerun Pipeline Wizard. 2. Select the pipeline association by following steps 1 and 2 of the Wizard (the Begin and Select Pipeline screens).

1. Click the View Associations button on the GenePattern tab to display the Associated Pipelines pane. 2. Select the pipeline association from the Associated Pipelines pane. 3. Click the Run button in the Associated pipelines pane. The GRRD Add-In displays step 3 of the Rerun Pipeline Wizard.

Note: If you have previously rerun this pipeline, the earlier results are overwritten. No warning message is displayed.

Note: If you have previously rerun this pipeline, a message asks if you want to discard the earlier results. Click OK to rerun the pipeline, overwriting the earlier results. Click Cancel to stop.

3. Rerun the pipeline by completing the Wizard steps 3 (Parameters) through 6 (Finish).

This section describes the Wizard. If you prefer a forms-based UI, clear the “Use Wizard UI” checkbox on the GenePattern tab.

Saving pipeline results: The GRRD Add-In temporarily stores the most recent rerun’s provenance and results. To maintain this information within the document, save the pipeline results as described in Viewing and Saving Pipeline Results. The GenePattern server you use also maintains a full history of your reruns and interactions.


Begin

This screen states the task to be performed. If you do not want to see this step the next time you run the Wizard, clear the “Show this screen next time I start the wizard” checkbox. Click Next to continue.

Select Pipeline

Use the Select Pipeline screen to select the pipeline association for the pipeline that you want to rerun. Click Next to continue.


Parameters

The Parameters screen displays the pipeline’s input parameters (if any). By default, pipeline parameters are set to their original values. Change the parameter values if desired and click Next.

Select Stop Point

Pipeline execution begins at the first step, but can be stopped at any point. The pipeline step highlighted on this screen is the stop point selected by the document author. For example, if the associated image is based on the 4th step of a 10-step pipeline, the author might set the stop point at step 4. Leave this step selected or select an alternate stop point and then click Next.
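The effect of a stop point can be pictured with a short sketch. This is only an illustration of the behavior described above, not the Add-In's implementation: the function name `steps_executed` is hypothetical, and the sketch assumes that the step chosen as the stop point is itself executed (as the 4th-step example suggests).

```python
def steps_executed(pipeline_steps, stop_point):
    """Return the steps that run when execution stops at `stop_point`.

    `stop_point` is a 1-based step number. Assumption: execution always
    starts at step 1 and the stop-point step itself is run.
    """
    if not 1 <= stop_point <= len(pipeline_steps):
        raise ValueError("stop point must be between 1 and the number of steps")
    return pipeline_steps[:stop_point]


if __name__ == "__main__":
    # A hypothetical 10-step pipeline stopped at step 4.
    steps = [f"step {i}" for i in range(1, 11)]
    for name in steps_executed(steps, 4):
        print(name)
```

With a 10-step pipeline and a stop point of 4, only the first four steps are returned, so results from steps 5 through 10 are not generated.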

Server Settings

Use this screen to choose the GenePattern server on which to rerun the pipeline, whether to receive email notification when the pipeline completes, and whether to display visualizers when the pipeline completes. Each choice is described below. Note that if there are no visualizers in the pipeline, the “Show visualizers…” checkbox is disabled.


GenePattern server

Choose the GenePattern server on which to run the pipeline. This option differs slightly depending on whether the document contains a copy of the pipeline or a reference to the pipeline on a particular GenePattern server (see Controlling Document Size).

Document contains a copy of the pipeline. Select the server from the dropdown list. Select a previously configured server connection, create a new server connection (select “New Connection”), or connect to the GenePattern public server (select “GenePattern Public Server”). If the pipeline is not on the selected server, the GRRD Add-In automatically exports it from the document and installs it on the server.

Document contains a reference to the pipeline. To run the pipeline on the original server, leave the first radio button selected. Otherwise, choose the second radio button and select a server from the dropdown list. If the pipeline is not on the selected server, the GRRD Add-In automatically exports the pipeline from the original server and installs it on the selected server. You must have configured connections to both servers. Running the pipeline on the original server is the most efficient choice as the pipeline is resident there.

If you have not yet configured a connection to the selected server, the GRRD Add-In displays the Server Connection form. Enter the requested information to configure the connection. For more information, see Connecting to a GenePattern Server.
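The server choices above amount to a small decision procedure. The following sketch restates it in Python purely as a reading aid. The names `PipelineAssociation` and `plan_rerun` and the action strings are hypothetical, not part of the GRRD Add-In, and the sketch assumes that a referenced pipeline is always present on its original server.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class PipelineAssociation:
    name: str
    embedded: bool                         # True: document holds a copy; False: a reference
    original_server: Optional[str] = None  # set only when the document holds a reference


def plan_rerun(assoc: PipelineAssociation, selected_server: str,
               installed_on: Set[str], configured: Set[str]) -> List[str]:
    """List, in order, what must happen before the pipeline can run."""
    actions = []
    # A referenced pipeline run elsewhere needs connections to both servers.
    needed = {selected_server}
    if not assoc.embedded and selected_server != assoc.original_server:
        needed.add(assoc.original_server)
    for server in sorted(needed - configured):
        actions.append(f"configure connection to {server}")
    # Assumption: a referenced pipeline is resident on its original server.
    installed = set(installed_on)
    if not assoc.embedded:
        installed.add(assoc.original_server)
    if selected_server not in installed:
        source = "document" if assoc.embedded else assoc.original_server
        actions.append(f"export pipeline from {source}")
        actions.append(f"install pipeline on {selected_server}")
    actions.append(f"run pipeline on {selected_server}")
    return actions


if __name__ == "__main__":
    ref = PipelineAssociation("example", embedded=False, original_server="original")
    for action in plan_rerun(ref, "other", installed_on=set(), configured={"other"}):
        print(action)
```

The sketch makes the efficiency remark concrete: rerunning a referenced pipeline on its original server needs no export or install step, whereas any other server adds both.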


Email notification

To request that the GenePattern server send you email when the pipeline finishes executing:

1. Check the “Send email when job is finished” checkbox.
2. Enter the address to which the server should send the email.

Show visualizers

If the pipeline includes one or more visualizers, by default, they are displayed when the pipeline finishes executing. To prevent the visualizers from being displayed when the pipeline completes, clear the “Show visualizers when job finishes” checkbox. After rerunning a pipeline, you can show its visualizers at any time from the Associated Pipelines pane. See Using Visualizers to Display Results for more information.

Finish

The final screen summarizes the selected options. To start rerunning the pipeline, click Finish.

Submitting a Job

A progress indicator shows a description of each step as it is submitted to the GenePattern server. During this process, the progress indicator displays a Cancel button. When the pipeline has been submitted, the Cancel button changes to a Close button. Click Close to close the progress indicator window.


View the pipeline results on the Associated Pipelines pane. While the pipeline is running, the Associated Pipelines pane shows a progress indicator. The results appear when the pipeline completes. See Viewing Pipeline Results for more information.


Viewing and Saving Pipeline Results

Viewing Pipeline Results

To view pipeline results:

1. Display the pipeline details in the Associated Pipelines pane (see Looking at Pipelines in a Document).
2. The last panel of the Associated Pipelines pane displays the original pipeline results and, if the pipeline has been rerun, the rerun pipeline results. The Results tab lists all analysis result files, the Image tab lists the generated image files (if any), and the Parameters tab lists the parameter values used to run the pipeline.

If the pipeline has not been rerun, the panel lists the pipeline’s output or result files. The files themselves are not stored in the document, but can be obtained by rerunning the pipeline with its original data and parameter settings.

If the pipeline has been rerun, use the radio buttons to display either the original or rerun pipeline results. The rerun result files are temporarily stored in the document and can be viewed:

• Double-click a file listed in the Results or Image tab to display the file.
• Click the Show Viewers button to display the visualizers run by the pipeline (if any). For more information, see Using Visualizers to Display Results.


Rerunning a Pipeline Generates Temporary Results

Data generated by rerunning a pipeline is temporarily stored with the pipeline association. Unless explicitly saved (see Saving Pipeline Results), the generated data is overwritten when you rerun the pipeline again from this pipeline association or deleted when you close the document or exit from Word. If you close a document without saving the pipeline results, the Add-In displays the following message:

To save the pipeline results, click No and then save the results (see Saving Pipeline Results).

Note: If you close a document without saving changes, Word displays the warning message shown below. Clicking Yes to this message does not save pipeline results.
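The lifetime of rerun results described above can be summarized as a small state model. The sketch below is only a conceptual illustration. The class `AssociationResults` and its methods are hypothetical and do not correspond to any Add-In API.

```python
class AssociationResults:
    """Conceptual model of the results held by one pipeline association."""

    def __init__(self):
        self.temporary = None  # results of the most recent rerun, if any
        self.saved = []        # results the user explicitly saved

    def rerun(self, result_files):
        # A new rerun replaces whatever the previous rerun produced.
        self.temporary = list(result_files)

    def save(self):
        # Explicit save: the only way rerun results outlive the session.
        if self.temporary is None:
            raise RuntimeError("no rerun results to save")
        self.saved.append(list(self.temporary))

    def close_document(self):
        # Closing the document (or exiting Word) drops unsaved rerun results.
        self.temporary = None


if __name__ == "__main__":
    results = AssociationResults()
    results.rerun(["first.res"])
    results.rerun(["second.res"])  # overwrites first.res
    results.save()
    results.close_document()
    print(results.temporary, results.saved)
```

In this model, the only results that survive a second rerun or a closed document are those copied out by an explicit save.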

Saving Pipeline Results

To save the data generated by rerunning a pipeline, save the results to external files or insert them into the current document. Inserting pipeline results into the document increases the size of the document by the size of the files inserted into the document.

To save the pipeline results:

1. View the results generated by rerunning the pipeline (see Viewing Pipeline Results).
2. To save the results to an external file, select a result file and click the “Save Result” button. The Add-In prompts you to save all results or the selected result file.


3. To insert the results into the current document, select a result file and click the “Insert into this Document” button.

   • To insert all results or an individual result file, select a file from the Results tab.
   • To insert a graphic image, select a file from the Image tab.

   The following dialog appears:

   • Use the radio buttons to save the currently selected file or all result files. (To save all result files, select a file from the Results tab, not the Image tab.)
   • Enter a comment.
   • Leave the “Store Input Files in Document” checkbox selected to copy the pipeline input files (if any) into the current document. This increases the size of the document by the size of the input files. If the reader has access to the input files (for example, via URL), clear the checkbox to avoid increasing the size of the document. See Controlling Document Size for more information.
   • Click OK. The GRRD Add-In creates a new section at the end of the document as described below.

Inserting the results into the current document creates a new section at the end of the document, which includes the pipeline name, your comments, the parameter values used to rerun the pipeline, and the result files. The inserted text contains all of the information needed to reproduce the results (provided that all of the necessary input files are available from the document). You can use the pipeline name to find and display the pipeline in the Associated Pipelines pane (see Looking at Pipelines in a Document) and then rerun the pipeline using the parameter values shown in the text.


The following figures show pipeline results inserted into a document in two ways: (1) by selecting all of the results from the Results tab and (2) by selecting a single image from the Image tab. Input files and output files are represented by file icons. Double-click an icon to display the associated file. Drag and drop an icon to copy the file to a folder in Windows Explorer.


Using Visualizers to Display Results

GenePattern visualizers, such as the HeatMapViewer and HierarchicalClusteringViewer, are special modules designed to display data. Most GenePattern modules run on the GenePattern server. Visualizers are downloaded and run on your local machine. As of GenePattern 3.1.1, visualizers require Java 1.5; see the GenePattern User Guide for the latest requirements.

When you rerun a pipeline that includes visualizers, by default, the GRRD Add-In launches the visualizers when the pipeline finishes executing. After rerunning a pipeline, you can launch its visualizers at any time by clicking the Show Viewers button on the Associated Pipelines pane.

To launch a visualizer, the GRRD Add-In must copy the visualizer code from the pipeline to a temporary directory on your local machine and then run the visualizer. If the pipeline has been copied into the document, the Add-In can copy the visualizer code from the document to your local machine. If the document references a pipeline on a GenePattern server, the Add-In must connect to the GenePattern server that holds the referenced pipeline and download the visualizer code from that server to your local machine.

Images from a GenePattern interactive visualizer cannot be inserted into a document using the GRRD Add-In. You can save an image from the visualizer, insert it into the document using the Word picture tools, and then associate the generating pipeline with the image.


Exporting Pipelines from a Document

You can export GenePattern pipelines and data from a document by using either Word and the GRRD Add-In or GenePattern and the GenePatternDocumentExtractor module.

Exporting a Pipeline Using the GRRD Add-In

Zip files provide a convenient means of sharing pipelines among GenePattern users. You can export a pipeline from a Word document to a zip file. The zip file can then be used to install the pipeline on a GenePattern server.

To export a pipeline from a Word document:

1. Display the pipeline details in the Associated Pipelines pane (see Looking at Pipelines in a Document).
2. Select the pipeline association that you want to export.
3. Click the Export button. The GRRD Add-In exports the pipeline and its modules to a zip file that can be imported into a GenePattern server.

For instructions on how to import a zip file into GenePattern, see the GenePattern User Guide.
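Before importing or sharing an exported zip file, it can be useful to check what it contains and how large it is. The following is a minimal sketch using only the Python standard library. The function name `list_zip_entries` and the example file name are hypothetical, and the internal layout of an exported pipeline zip is not described in this manual, so the sketch simply lists whatever entries the archive holds.

```python
import zipfile


def list_zip_entries(zip_path):
    """Return (entry name, uncompressed size in bytes) for each archive entry."""
    with zipfile.ZipFile(zip_path) as archive:
        return [(info.filename, info.file_size) for info in archive.infolist()]


if __name__ == "__main__":
    # "exported_pipeline.zip" is a placeholder for the file saved by Export.
    for name, size in list_zip_entries("exported_pipeline.zip"):
        print(f"{size:>12}  {name}")
```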

Exporting a Pipeline Using GenePatternDocumentExtractor

Users who do not have the GRRD Add-In can still take advantage of GenePattern reproducible research documents by using the GenePatternDocumentExtractor module to extract pipelines and data from a Word document. The module can be installed on any GenePattern server. It takes a Word document as input and generates a zip file containing all of the GenePattern pipelines and data embedded in that document. The extracted pipelines can be installed and run on any GenePattern server.

The GenePatternDocumentExtractor module is available on the GenePattern public server (http://genepattern.broadinstitute.org/gp/). It is also available from the GenePattern module repository if you would like to install it on your local GenePattern server. For instructions on how to install a module on your server, see the GenePattern User Guide.


Controlling Document Size

When you add a pipeline association to a document, the document stores either (1) a copy of the pipeline or (2) a reference to the pipeline on a specific GenePattern server. Storing a reference to the pipeline keeps the Word document smaller. However, to rerun or export a referenced pipeline, readers must have access to the specified GenePattern server where the pipeline resides. Storing a pipeline in the Word document makes the file larger, but the embedded pipeline can then be run on or exported to any GenePattern server available to the reader.

Large input files for a pipeline also increase the size of the document. If a pipeline was run with a 10 MB input file, then the document must include that input file for potential reruns. To avoid this problem, when you run the pipeline on the GenePattern server, you can specify a URL for the input file instead of uploading the input file from your local computer. Then, when you associate the pipeline with the Word document, only the URL is stored in the document. Note that this should be a publicly accessible URL if you intend to distribute the document.

To save the data generated by rerunning a pipeline, you can insert it into the document or save it to external files (see Saving Pipeline Results). Inserting the results into the document increases the size of the document by the size of the saved input and/or result files. To reduce the size of the document, save the generated data to external files.

To see the size of the GenePattern data stored in the document, open the Dashboard pane by clicking the Dashboard tool on the GenePattern tab. The bottom portion of the pane lists the GenePattern content that is embedded in the document, as well as the approximate size of that content. If a module or pipeline appears in multiple pipelines associated with a document, only one copy of that module or pipeline is stored in the document.

Note: This pane does not include pipeline results inserted into the document.

Documents larger than 10-20 MB are typically too large for most email clients. You may need to use an alternate distribution method for large documents, for example, an FTP site or file-sharing site such as box.net.
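The size rules in this section can be combined into a rough back-of-the-envelope estimate. The sketch below is an illustration under stated assumptions, not a description of how the Add-In or the Dashboard computes sizes: the function `estimate_embedded_bytes` and the dictionary layout are hypothetical, uploaded input files are assumed to be stored once per association, and Word's own overhead is ignored.

```python
MB = 1024 * 1024
EMAIL_LIMIT_BYTES = 10 * MB  # lower end of the 10-20 MB range mentioned above


def estimate_embedded_bytes(associations, inserted_result_bytes=0):
    """Estimate the GenePattern payload carried by a document, in bytes."""
    modules = {}     # module or pipeline name -> size; shared items count once
    input_bytes = 0
    for assoc in associations:
        if assoc["embedded"]:          # a reference adds no module code
            modules.update(assoc["modules"])
        for item in assoc["inputs"]:
            if not item["is_url"]:     # URL inputs store only the URL
                input_bytes += item["bytes"]
    return sum(modules.values()) + input_bytes + inserted_result_bytes


if __name__ == "__main__":
    example = [
        {"embedded": True, "modules": {"ModuleA": 1 * MB, "ModuleB": 2 * MB},
         "inputs": [{"is_url": False, "bytes": 10 * MB}]},
        {"embedded": True, "modules": {"ModuleA": 1 * MB},
         "inputs": [{"is_url": True, "bytes": 50 * MB}]},
    ]
    total = estimate_embedded_bytes(example)
    print(f"{total / MB:.1f} MB, over email limit: {total > EMAIL_LIMIT_BYTES}")
```

In the example data, the second association adds nothing to the estimate: its only module is already embedded by the first association, and its input is referenced by URL.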


Troubleshooting

GenePattern Tab Is Not Visible

The GenePattern tab can be missing for several reasons:

a. .NET Framework wasn’t installed because rollback support is not enabled

The .NET Framework installation will fail if you do not have rollback enabled. A Microsoft Knowledge Base article about this problem and how to fix it is here: http://support.microsoft.com/support/kb/articles/q312/4/99.asp

b. Word was running during the installation

If you had Word open during installation (this manual, for example), then the tab does not appear until all copies of Word are closed. Restart Word and the tab should be there.

c. The Add-In is temporarily disabled

When the GRRD Add-In is installed, it may be disabled by default. For instructions on how to enable and disable the Add-In, see Temporarily Disable the Application. Follow the instructions to enable the Add-In. If the tab does not appear, proceed to The Add-In is disabled.

d. The Add-In is disabled

In addition, Word sometimes disables the GRRD Add-In if there is a crash while the Document System is executing code. The GenePattern tab may not reappear until the user manually re-enables the add-in. To do this:

1) Click the Office Button.
2) Click “Word Options”.
3) Click “Add-Ins” in the left column.
4) Select “Manage: Disabled Items” at the bottom and then click “Go”.
5) Check the box next to “GenePattern Reproducible Research Document Add-In”.
6) Restart Word.
7) Check that the GenePattern tab appears.
8) If the tab does not appear, proceed to Repairing the installation.

e. Repairing the installation

In some situations, the GRRD Add-In will not function until it has been repaired or even uninstalled and reinstalled. To repair the installation, go to Start -> Control Panel -> Add or Remove Programs and select “GenePattern Reproducible Research Document Add-In.” Click “Change” and choose “Repair” at the next screen.

F1 Displays Help for Word

Moving the cursor over the buttons on the GenePattern tab displays the message “Press F1 for more help.” Pressing F1 displays help for Word, not for the GRRD Add-In. Help for the GRRD Add-In is provided by this document.
