Diffstat (limited to 'brain.documentation')
| -rw-r--r-- | brain.documentation/QUICKSTART.rst | 26 | 
1 files changed, 24 insertions, 2 deletions
diff --git a/brain.documentation/QUICKSTART.rst b/brain.documentation/QUICKSTART.rst
index 8a862df..c63f358 100644
--- a/brain.documentation/QUICKSTART.rst
+++ b/brain.documentation/QUICKSTART.rst
@@ -150,10 +150,32 @@ Use *cron job* to schedule running these two scripts on December 25 and December
 download_ena_records.py will download the following files to folder Data/information:
 - ena_read_run.xml
+  + run accession
+  + title
+  + experiment accession
+
 - ena_read_experiment.xml
-- ena_sample.xml
-- ema_study.xml
+  + experiment accession (e.g., SRX19576837)
+  + study accession (e.g., SRP425891)
+  + sample descriptor (e.g., SRS16960021)
+  + title
+  + library strategy (e.g., RNA-Seq)
+  + library source (e.g., TRANSCRIPTOMIC)
+- ena_sample.xml
+  + sample accession, e.g., SRS17013367
+  + taxon id (e.g., 3702)
+  + title
+
+- ena_study.xml
+  + study accession, e.g., SRP425891
+  + study abstract
+
+The above four files can be linked through accessions.
+Given a run accession in ena_read_run.xml, we can get its associated experiment accession.
+Given an experiment accession in ena_read_experiment.xml, we can get its associated sample accession and study accession.
+Given a sample accession, we can find its taxon id in ena_sample.xml.
+Given a study accession, we can find its abstract in ena_study.xml.
 parse_end_xlm.py will use the above four XML files as input, and output the following two files to Data/information:
 - rnaseq_info_database.json (containing run IDs, library_source and library_strategy)
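
The accession chain described in the added QUICKSTART text (run, then experiment, then sample and study) can be sketched roughly as below. This is only an illustration, not the logic of the repository's parsing script. It assumes the four files use SRA-style element names (`RUN`, `EXPERIMENT_REF`, `STUDY_REF`, `SAMPLE_DESCRIPTOR`, `TAXON_ID`, `STUDY_ABSTRACT`), which the diff does not confirm. The inline XML records are toy data. The run accession `RUN_PLACEHOLDER_1`, the abstract text, and the helper names `index_by_accession` and `link_run` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Toy stand-ins for the four downloaded files. Element names are ASSUMED to
# follow SRA-style XML; the run accession and abstract text are placeholders.
RUN_XML = """<RUN_SET>
  <RUN accession="RUN_PLACEHOLDER_1">
    <EXPERIMENT_REF accession="SRX19576837"/>
  </RUN>
</RUN_SET>"""

EXPERIMENT_XML = """<EXPERIMENT_SET>
  <EXPERIMENT accession="SRX19576837">
    <STUDY_REF accession="SRP425891"/>
    <DESIGN><SAMPLE_DESCRIPTOR accession="SRS16960021"/></DESIGN>
  </EXPERIMENT>
</EXPERIMENT_SET>"""

SAMPLE_XML = """<SAMPLE_SET>
  <SAMPLE accession="SRS16960021">
    <SAMPLE_NAME><TAXON_ID>3702</TAXON_ID></SAMPLE_NAME>
  </SAMPLE>
</SAMPLE_SET>"""

STUDY_XML = """<STUDY_SET>
  <STUDY accession="SRP425891">
    <DESCRIPTOR><STUDY_ABSTRACT>placeholder abstract</STUDY_ABSTRACT></DESCRIPTOR>
  </STUDY>
</STUDY_SET>"""


def index_by_accession(xml_text, tag):
    """Map each <tag> element's 'accession' attribute to the element itself."""
    root = ET.fromstring(xml_text)
    return {el.get("accession"): el for el in root.iter(tag)}


def link_run(run_acc, runs, experiments, samples, studies):
    """Follow run -> experiment -> (sample, study) and collect the linked fields."""
    exp_acc = runs[run_acc].find("EXPERIMENT_REF").get("accession")
    exp = experiments[exp_acc]
    study_acc = exp.find("STUDY_REF").get("accession")
    sample_acc = exp.find("DESIGN/SAMPLE_DESCRIPTOR").get("accession")
    return {
        "run": run_acc,
        "experiment": exp_acc,
        "sample": sample_acc,
        "study": study_acc,
        "taxon_id": samples[sample_acc].findtext("SAMPLE_NAME/TAXON_ID"),
        "abstract": studies[study_acc].findtext("DESCRIPTOR/STUDY_ABSTRACT"),
    }


runs = index_by_accession(RUN_XML, "RUN")
experiments = index_by_accession(EXPERIMENT_XML, "EXPERIMENT")
samples = index_by_accession(SAMPLE_XML, "SAMPLE")
studies = index_by_accession(STUDY_XML, "STUDY")

info = link_run("RUN_PLACEHOLDER_1", runs, experiments, samples, studies)
print(info)
```

For the real files under Data/information, `ET.parse(path).getroot()` would replace `ET.fromstring`. Building one dictionary per file keeps each lookup to a single step, which matters when the files hold many records.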
