Skip to content

Issue with preparing TxDB and GRanges object from GFF file #85

@benjytan88

Description

@benjytan88

Hi,
I am having some trouble making a gene annotation object (TxDB or GRanges) for CHM13 from the GFF file.
When making the TxDB object, I got the following warning message:

> txdb <- getTxDb(organism = "Homo sapiens", file = "./ref_genome/chm13v2.0_RefSeq_Liftoff_v5.1.gff3")
Import genomic features from the file as a GRanges object ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK
Warning messages:
1: In .extract_transcripts_from_GRanges(tx_IDX, gr, mcols0$type, mcols0$ID,  :
  some transcripts have no "transcript_id" attribute ==> their name ("tx_name" column
  in the TxDb object) was set to NA
2: In .extract_transcripts_from_GRanges(tx_IDX, gr, mcols0$type, mcols0$ID,  :
  the transcript names ("tx_name" column in the TxDb object) imported from the
  "transcript_id" attribute are not unique
3: In .find_exon_cds(exons, cds) :
  The following transcripts have exons that contain more than one CDS (only the first
  CDS was kept for each exon): NM_001134939.1, NM_001172437.2, NM_001184961.1,
  NM_001301020.1, NM_001301302.1, NM_001301371.1, NM_002537.3, NM_004152.3,
  NM_015068.3, NM_016178.2

Is this normal? I also noticed that there are quite a lot of NAs in the table and was having difficulties converting it to a GRanges object.

I think the annotation would be very useful to everyone else, so would it be possible for your team to host a TxDB / GRanges object of the annotation here?

Thank you very much.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions