package feature
Provides classes to represent genomic features, i.e., annotations with block coordinates, an orientation, and other associated information.
The main classes to use are feature.GenericFeature, feature.Transcript, and feature.MessengerRNA. These can be instantiated directly or by using a feature.FeatureBuilder.
In addition, genomic annotations with no associated information can be represented with feature.Block, a single contiguous block, and feature.BlockSet, an annotation consisting of multiple blocks.
Type Members
- final case class Block(chr: String, start: Int, end: Int, orientation: Orientation) extends Region with Product with Serializable
A single contiguous block on a chromosome with an Orientation.
- chr
Chromosome name. Cannot start with "chr" (otherwise an IllegalArgumentException is thrown). All code in this library automatically strips "chr" from feature chromosome names before instantiating Blocks. For example, format.GTF22Record strips "chr" from GTF2.2 lines before creating Features. Client code that implements new ways to create features will need to do this as well.
- start
Zero-based start position (inclusive)
- end
Zero-based end position (exclusive)
- orientation
Block orientation
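The naming and coordinate conventions above can be illustrated with a small stand-alone sketch. It does not use the library: `SimpleBlock` and `stripChr` are hypothetical names introduced only for illustration, and the real Block may perform checks that are not shown here.

```scala
object BlockConventionsSketch {

  // Hypothetical stand-in for feature.Block, reduced to name and coordinates.
  // Coordinates are zero-based; start is inclusive and end is exclusive.
  final case class SimpleBlock(chr: String, start: Int, end: Int) {
    // Mirrors the documented rule: a name starting with "chr" is rejected
    // with an IllegalArgumentException (which is what require throws).
    require(!chr.startsWith("chr"), s"Chromosome name must not start with 'chr': $chr")

    // With a half-open interval, the length is simply end - start.
    def length: Int = end - start
  }

  // Hypothetical helper for client code: strip a leading "chr" before
  // constructing a block from an external record.
  def stripChr(name: String): String = name.stripPrefix("chr")
}
```

Client code that reads names such as "chr7" from a file would call something like `stripChr` first and only then construct the block.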
- final case class BlockSet(blocks: List[Block]) extends Region with Product with Serializable
A collection of non-overlapping Blocks on the same chromosome with the same Orientation.
A BlockSet can represent the structure of, e.g., a spliced RNA transcript.
A BlockSet must have at least two Blocks. For Regions with one Block, use a Block instead. The list of Blocks passed to the constructor must be non-overlapping and on the same chromosome, and have the same Orientation. They must be in ascending order of chromosome position.
- blocks
List of at least two Blocks, non-overlapping and in ascending order of chromosome position
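The constructor requirements for a BlockSet can be summarized as a single predicate. The following is a minimal sketch, not the library's validation code: `SimpleBlock` and `isValidBlockList` are hypothetical, the strand is modeled as a plain character, and treating two blocks that touch (one ends exactly where the next starts) as non-overlapping is an assumption.

```scala
object BlockSetRulesSketch {

  // Hypothetical stand-in for feature.Block; strand is a Char for brevity.
  final case class SimpleBlock(chr: String, start: Int, end: Int, strand: Char)

  def isValidBlockList(blocks: List[SimpleBlock]): Boolean = {
    // A BlockSet needs at least two blocks; a single block is just a Block.
    val enoughBlocks = blocks.lengthCompare(2) >= 0
    // All blocks share one chromosome and one orientation.
    val sameChr = blocks.map(_.chr).distinct.size <= 1
    val sameStrand = blocks.map(_.strand).distinct.size <= 1
    // Ascending and non-overlapping: each block ends at or before the
    // start of the next one (ends are exclusive). Assumption: touching
    // blocks are accepted.
    val ordered = blocks.zip(blocks.drop(1)).forall { case (a, b) => a.end <= b.start }
    enoughBlocks && sameChr && sameStrand && ordered
  }
}
```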
- sealed abstract class Feature extends Ordered[Feature]
A genomic feature.
Includes a non-empty underlying Region and an optional feature name.
- final class FeatureBuilder extends AnyRef
A builder for Features.
Feature properties such as Blocks, name, gene name, and CDS start and end can be added one by one. The appropriate type in the Feature hierarchy is determined automatically based on the properties that have been added.
The Feature is obtained by calling get.
The following are the valid combinations of provided properties and the types they give rise to.
One or more blocks, optional feature name, no gene name, no CDS start, no CDS end -> GenericFeature
One or more blocks, optional feature name, a gene name, no CDS start, no CDS end -> Transcript
One or more blocks, optional feature name, optional gene name, a CDS start, a CDS end -> MessengerRNA. If the CDS is invalid (length < 6 or length not divisible by 3) a Transcript is returned instead.
Any other combination of properties results in an IllegalArgumentException when calling get.
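The combinations listed above amount to a small decision table. The sketch below restates that table as a pure function. It is not the FeatureBuilder implementation: `resultingKind` and the `Kind` values are hypothetical, and the spliced CDS length is passed in as a number rather than computed from blocks. The feature name is omitted because it is optional in every case.

```scala
object BuilderRulesSketch {

  // Hypothetical labels for the three result types described in the text.
  sealed trait Kind
  case object GenericFeatureKind extends Kind
  case object TranscriptKind extends Kind
  case object MessengerRnaKind extends Kind

  def resultingKind(
      numBlocks: Int,
      geneName: Option[String],
      cdsStart: Option[Int],
      cdsEnd: Option[Int],
      splicedCdsLength: Int // only consulted when both CDS bounds are given
  ): Kind = {
    // require throws IllegalArgumentException, matching the documented error.
    require(numBlocks >= 1, "At least one block is required")
    (cdsStart, cdsEnd) match {
      case (None, None) =>
        if (geneName.isDefined) TranscriptKind else GenericFeatureKind
      case (Some(_), Some(_)) =>
        // An invalid CDS (shorter than 6 or not a multiple of 3) falls back
        // to a Transcript instead of failing.
        if (splicedCdsLength >= 6 && splicedCdsLength % 3 == 0) MessengerRnaKind
        else TranscriptKind
      case _ =>
        throw new IllegalArgumentException("CDS start and CDS end must be provided together")
    }
  }
}
```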
- trait FeatureBuilderModifier extends AnyRef
A way to encapsulate arbitrary operations on a FeatureBuilder.
The op function performs the operations on a specified FeatureBuilder and returns a new modified FeatureBuilder.
Classes that mix in this trait include format.GTF22Record.
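The shape of this pattern can be sketched with stand-in types. Everything below is hypothetical: `SketchBuilder`, `SketchModifier`, `WithGeneName`, and `applyAll` are illustrative names, and the exact signature of op in the library is not shown on this page. The sketch only demonstrates the idea of an operation that takes a builder and returns a new, modified one.

```scala
object ModifierSketch {

  // Hypothetical immutable builder holding two optional properties.
  final case class SketchBuilder(
      name: Option[String] = None,
      geneName: Option[String] = None
  )

  // Hypothetical analogue of FeatureBuilderModifier: op returns a new
  // builder and leaves the one it was given untouched.
  trait SketchModifier {
    def op(fb: SketchBuilder): SketchBuilder
  }

  // Example modifier that sets the gene name.
  final case class WithGeneName(gene: String) extends SketchModifier {
    def op(fb: SketchBuilder): SketchBuilder = fb.copy(geneName = Some(gene))
  }

  // Apply several modifiers in order, threading the builder through each.
  def applyAll(fb: SketchBuilder, mods: List[SketchModifier]): SketchBuilder =
    mods.foldLeft(fb)((builder, mod) => mod.op(builder))
}
```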
- final case class Gene(transcripts: Set[Transcript], name: String) extends Product with Serializable
A gene giving rise to one or more Transcripts.
- transcripts
Non-empty Set of Transcripts
- name
Gene name
- class GenericFeature extends Feature
A Feature consisting of an underlying Region and an optional name.
- final case class MessengerRNA(blocks: Region, cdsStart: Int, cdsEnd: Int, name: Option[String], geneId: Option[String]) extends Transcript with Product with Serializable
A representation of a spliced messenger RNA.
A MessengerRNA has an underlying Region that specifies the chromosome, Blocks, and Orientation. In addition, a MessengerRNA has an optional name and parent gene name. Finally, it also has CDS start and end positions.
- blocks
Non-empty underlying Region
- cdsStart
Zero-based CDS start position (inclusive). The CDS start is the smallest (leftmost) position of the CDS, regardless of transcript Orientation. In other words, if Orientation is Plus, the CDS start is the first position of the start codon. If Orientation is Minus, the CDS start is the 3'-most position of the stop codon. The CDS start must lie within one of the Blocks. The CDS must be at least 6 nucleotides in length (after splicing) and its length must be a multiple of 3. Otherwise, an IllegalArgumentException is thrown.
- cdsEnd
Zero-based CDS end position (exclusive). The CDS end is one plus the largest (rightmost) position of the CDS, regardless of transcript Orientation. In other words, if Orientation is Plus, the CDS end is one plus the last position of the stop codon. If Orientation is Minus, the CDS end is one plus the 5'-most position of the start codon. The CDS end must lie within one of the Blocks or be equal to one plus a Block end position. The CDS must be at least 6 nucleotides in length (after splicing) and its length must be a multiple of 3. Otherwise, an IllegalArgumentException is thrown.
- name
Optional feature name. Do not pass Some(""); use None in that case.
- geneId
Optional parent gene name. Do not pass Some(""); use None in that case.
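The length rule for the CDS refers to the spliced length, that is, the number of CDS positions that fall inside Blocks. The following sketch shows one way to compute that length and apply the rule. It is not the library's code: `splicedCdsLength` and `isValidCds` are hypothetical, and blocks are modeled as plain (start, end) pairs with zero-based, half-open coordinates.

```scala
object CdsLengthSketch {

  // Sum, over all blocks, of the overlap between the block [start, end)
  // and the CDS span [cdsStart, cdsEnd). Introns contribute nothing.
  def splicedCdsLength(blocks: List[(Int, Int)], cdsStart: Int, cdsEnd: Int): Int =
    blocks.map { case (start, end) =>
      math.max(0, math.min(end, cdsEnd) - math.max(start, cdsStart))
    }.sum

  // Documented rule: at least 6 nucleotides and a multiple of 3.
  def isValidCds(blocks: List[(Int, Int)], cdsStart: Int, cdsEnd: Int): Boolean = {
    val length = splicedCdsLength(blocks, cdsStart, cdsEnd)
    length >= 6 && length % 3 == 0
  }
}
```

For blocks (100, 200) and (300, 400) with a CDS from 190 to 305, the first block contributes 10 positions and the second contributes 5, so the spliced length is 15 and the rule is satisfied even though the genomic span is 115.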
- sealed trait Orientation extends AnyRef
An orientation for a genomic feature.
Can refer to a particular DNA strand if the feature is associated with a strand, or unstranded if not.
- sealed abstract class Region extends Ordered[Region]
A genomic region.
Can include zero, one, or multiple blocks on a chromosome. Includes an orientation. Does not include any further information.
- sealed class Transcript extends GenericFeature
A representation of a spliced transcript.
A Transcript has an underlying Region that specifies the chromosome, Blocks, and Orientation. In addition, a Transcript has an optional name and parent gene name.
Value Members
- object Block extends Serializable
Companion functions for Block.
- object Empty extends Region with Product with Serializable
An empty region. Empty has no chromosome, Blocks, or Orientation.
- object Exceptions
Exceptions for invalid Features.
- object Minus extends Orientation
The minus strand of DNA.
- object Orientation
Utility methods for calculations on Orientations.
- object Plus extends Orientation
The plus strand of DNA.
- object Unstranded extends Orientation
Unstranded: not associated with a strand, or strand unknown.