About
Phenoflow
Phenoflow is the name for a conceptual model and a microservice architecture, which includes this fork of CWL Viewer, that aim to enhance the reproducibility and portability of computable phenotypes.
CWL Viewer
CWL Viewer is a richly featured web visualisation suite for workflows written in the Common Workflow Language with an aim of facilitating sharing, understanding and discovery as well as encouraging best practices when writing workflows and their tooling.
Cite as: https://doi.org/10.7490/f1000research.1114375.1
Technical Report: https://doi.org/10.5281/zenodo.823295
CWL Viewer also won the F1000Research Best Poster Award at ISMB/ECCB 2017 for its poster submission.
This project was developed at the eScience Lab at The University of Manchester, with work supported by BioExcel, funded by the European Union Horizon 2020 program under grant agreement 675728.
Contributions are welcome in the form of issues and pull requests to the GitHub repository.
Privacy policy
CWL Viewer publishes visualizations of workflows from publicly available git repositories hosted by third parties like github.com or gitlab.com. Anyone can submit a workflow, which will be added to our public listing.
Tracking usage
We do not track individual users of CWL Viewer, but we do record general usage (e.g. web server access log) for operational purposes and to prevent abuse. We may use HTTP session cookies in order to assist workflow submission, but do not use cookies to identify users.
What information is held?
We hold information about public open source workflows in order to visualize them graphically and textually, and to make their declared metadata accessible to the public in different formats such as linked data. This information may be held until its removal is explicitly requested. However, we reserve the right to remove any workflow from the listing without prior notice.
Metadata shown from the public workflows may include personal data, including authorship or as part of workflow descriptions. We retrieve this information from the submitted git repository. Downloading a workflow or its metadata may include information from the git repository not otherwise shown in the CWL Viewer interface, e.g. authors from git commit history.
For performance reasons the CWL Viewer may keep a copy of the checked out git repository and the derived metadata. We may at a later date retrieve published changes from the original repository to update the information held.
Where is information exposed?
Workflows and their metadata can be accessed in CWL Viewer through the public listing by browsers or programmatically through the API, and can be downloaded in multiple formats like ZIP, SVG or RDF.
CWL Viewer generates and exposes permalinks which reference the git commit and the workflow path within the git repository, but not the git repository location or username. These permalinks are only resolvable with the public https://view.commonwl.org/ if it has previously visualized a corresponding public git repository.
Metadata from public workflows may be published to the OpenAIRE registry, including author names and workflow title.
Best Practices
To ensure that your workflow is well presented in CWL Viewer, we recommend following the CWL Best Practices. Those which are specifically relevant to the viewer are detailed below, but it is suggested that you try to meet as many as possible to improve the general quality and reproducibility of your workflows.
Some limitations of the CWL Viewer which you may need to be aware of are also described here.
Label Strings
Include a top level short label summarising each tool and workflow
Labels give the user an easy human-readable version of the name for the tool or workflow
For workflows this will be displayed at the top of the page as the title, and for tools it will be displayed in the table and as the name of the step in the visualisation. If a label is given at the step level, it will take priority over the top-level tool label. You can use this to provide a more descriptive label of the tool's application in the particular step if preferred.
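As a minimal sketch of the two label levels described above, the hypothetical workflow below sets a top-level label and overrides a tool's label on one step. The workflow, the tool file paths and all identifiers are illustrative assumptions, not taken from a real repository.

```yaml
# Hypothetical workflow; tool paths and identifiers are placeholders.
cwlVersion: v1.0
class: Workflow
label: Align reads and sort alignments    # top-level label, shown as the page title

inputs:
  reads: File

outputs:
  sorted_alignments:
    type: File
    outputSource: sort/sorted_alignments

steps:
  align:
    run: tools/aligner.cwl                 # no step label: the tool's own top-level label is used
    in:
      reads: reads
    out: [alignments]
  sort:
    label: Sort alignments by coordinate   # step-level label, takes priority over the tool's label
    run: tools/sorter.cwl
    in:
      alignments: align/alignments
    out: [sorted_alignments]
```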
Doc Strings
If useful, include a top-level doc string providing a longer, more detailed description than was provided in the label (see above)
Docs give the user a detailed description of the role a tool or workflow performs
For workflows this will be displayed at the top of the page under the title, and for tools it will be displayed in the table. If a doc string is given at the step level, it will take priority over the top-level tool doc. You can use this to provide a more detailed description of the tool's application in the particular step if preferred.
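For illustration, a tool description with both a short label and a longer doc string might look like the following sketch. The tool itself (a line counter built on `wc -l`) and its identifiers are assumptions chosen only to keep the example small.

```yaml
# Hypothetical tool description; identifiers are placeholders.
cwlVersion: v1.0
class: CommandLineTool
label: Count lines                         # short, human-readable name
doc: |
  Counts the number of lines in a text file using `wc -l`
  and writes the result to standard output.
baseCommand: [wc, -l]

inputs:
  text_file:
    type: File
    inputBinding:
      position: 1

outputs:
  line_count:
    type: stdout

stdout: line_count.txt
```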
Conceptual Identifiers
All input and output identifiers should reflect their conceptual identity. Generic and uninformative names such as result or input/output should be avoided
Helpful identifiers allow for the links between steps in the CWL file to be easily distinguished
Identifiers are displayed in the tables and are unique to the step. The label is also used as a replacement for the identifier in the visualisation if provided.
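The fragment below sketches what conceptual identifiers could look like in practice. It is an excerpt of a hypothetical tool description rather than a complete file, and every identifier and file name in it is an illustrative assumption.

```yaml
# Fragment of a hypothetical tool description; not a complete CWL file.
inputs:
  reference_genome:          # rather than "input"
    type: File
    label: Reference genome  # shown in place of the identifier in the visualisation
  sequencing_reads:          # rather than "input2"
    type: File

outputs:
  aligned_reads:             # rather than "result" or "output"
    type: File
    outputBinding:
      glob: aligned.bam
```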
Format Specification
The format field should be specified for all input and output Files. Tools should use format identifiers from a relevant ontology such as the EDAM Ontology in the case of Bioinformatics tools. For plain types use the IANA media type list with $namespaces: { iana: "https://www.iana.org/assignments/media-types/" }, for example iana:text/plain, iana:text/tab-separated-values
The use of formal standards for format fields enables implementations to provide checks for compatibility in formatting of files
Ontologies will be parsed and the name of and link to the format displayed in the table on workflow pages. Plain formats will have the iana.org link given but will not display the name of the format.
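A sketch of how the two kinds of format identifier can be combined in one description is shown below. It is a fragment of a hypothetical tool, the identifiers are placeholders, and the specific EDAM release referenced under `$schemas` is an assumption; use whichever release is appropriate for your tools.

```yaml
# Fragment of a hypothetical tool description; not a complete CWL file.
$namespaces:
  edam: "http://edamontology.org/"
  iana: "https://www.iana.org/assignments/media-types/"
$schemas:
  - http://edamontology.org/EDAM_1.18.owl   # assumed release of the ontology

inputs:
  sequences:
    type: File
    format: edam:format_1929                # FASTA, from the EDAM Ontology
    inputBinding:
      position: 1

outputs:
  summary_table:
    type: File
    format: iana:text/tab-separated-values  # plain type from the IANA list
    outputBinding:
      glob: summary.tsv
```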
Separation of Concerns
Each CommandLineTool description should focus on a single operation only, even if the (sub)command is capable of more.
This allows for easier reuse of the tool in other workflows and easier understanding of its purpose
In CWL Viewer this ensures that steps are clear in purpose within the workflow and generated visualisation
JavaScript Elimination
Evaluate all use of JavaScript for possible elimination or replacement. For instance, for the manipulation of File names and paths, often one of the built-in File properties such as basename, nameroot, nameext etc. could be used instead
Tool runners can provide more efficient implementations of built-in functionality, so JavaScript expressions should be a last resort
CWL Viewer does not take into account JavaScript expressions when extracting information about your workflows
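As a sketch of replacing JavaScript with a built-in File property, the hypothetical tool below names its output after the input file using a plain parameter reference to basename. The tool (a gzip wrapper) and its identifiers are illustrative assumptions.

```yaml
# Hypothetical tool description; identifiers are placeholders.
cwlVersion: v1.0
class: CommandLineTool
baseCommand: [gzip, -c]

inputs:
  text_file:
    type: File
    inputBinding:
      position: 1

outputs:
  compressed_file:
    type: stdout

# A parameter reference to a built-in File property. No JavaScript
# expression is involved, so InlineJavascriptRequirement is not needed.
stdout: $(inputs.text_file.basename).gz
```

The JavaScript alternative would typically split the file path by hand inside a `${ ... }` expression block, which requires InlineJavascriptRequirement and is harder to read.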
Use of Subworkflows
CWL implementations which also implement SubworkflowFeatureRequirement can support nesting workflows as a step within others. Complex workflows with individual components which can be abstracted should utilise this to make their workflow modular and allow sections of them to be easily reused
Extracting subworkflows enables them to be run, developed and tested individually. It also makes them easier to understand
Subworkflows are simplified in the visualisations and are linked as a different workflow in the Step tables on each workflow page
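A minimal sketch of a workflow that nests another workflow as a step is given below. The file names `preprocess-workflow.cwl` and `tools/variant-caller.cwl`, and all identifiers, are hypothetical.

```yaml
# Hypothetical parent workflow; file names and identifiers are placeholders.
cwlVersion: v1.0
class: Workflow

requirements:
  SubworkflowFeatureRequirement: {}   # needed to run a Workflow as a step

inputs:
  raw_reads: File

outputs:
  variant_calls:
    type: File
    outputSource: call_variants/variant_calls

steps:
  preprocess:
    run: preprocess-workflow.cwl      # itself a "class: Workflow" document
    in:
      raw_reads: raw_reads
    out: [cleaned_reads]
  call_variants:
    run: tools/variant-caller.cwl     # an ordinary CommandLineTool
    in:
      reads: preprocess/cleaned_reads
    out: [variant_calls]
```

Because the nested workflow lives in its own file, it can also be submitted, run and tested on its own.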
Attribution
Include attribution information in your workflow and tool descriptions. For example, to attribute a person as the author of a workflow or tool with name, email and ORCID information, include the following statements at the top level:
$namespaces: { s: "http://schema.org/" }
s:author:
- class: s:Person
  s:name: Mark Robinson
  s:email: mailto:mark@example.com
  s:id: http://orcid.org/0000-0002-8184-7507
For attributing organisations, see this workflow as an example
Attribution information allows your workflows and tooling to be used by others while recognising your contributions. The inclusion of an ORCID allows you to be uniquely distinguished from other researchers
CWL Viewer parses attribution information for inclusion in the Research Object Manifest from both the Git commit logs and from the CWL descriptions themselves when expressed in the http://schema.org/author format as above
Licensing
Include an OSI-approved open source license in your workflow and tool descriptions. For example, the following two statements at the top level of a workflow or tool description license it under the Apache License, Version 2.0:
$namespaces: { s: "http://schema.org/" }
s:license: "https://www.apache.org/licenses/LICENSE-2.0"
A permissive open source license allows others to remix and use your tooling and workflows, which saves the community from repeating development effort and allows everyone to benefit
CWL Viewer is designed to allow people to locate and make use of the workflows developed by others as well as to share and demonstrate work, and open source licenses promote this goal
Limitations
Research Objects
Research Objects are constructed from the containing directory of the workflow file. This means tooling external to the directory but used by the workflow will not be included (see GitHub issue)
We recommend that you keep all files in the containing folder for current use of CWL Viewer
SSH Cloning
SSH URLs cannot be cloned or used as submodules, because this would require SSH keys to be set up.
We do not plan SSH support, because making key setup a required step for downloading the workflow would harm reproducibility.
Others
Other limitations or unimplemented features can be viewed on the GitHub issues page



