Diffstat (limited to 'brain.documentation/QUICKSTART.rst')
-rw-r--r-- brain.documentation/QUICKSTART.rst | 26
1 files changed, 24 insertions, 2 deletions
diff --git a/brain.documentation/QUICKSTART.rst b/brain.documentation/QUICKSTART.rst
index 8a862df..c63f358 100644
--- a/brain.documentation/QUICKSTART.rst
+++ b/brain.documentation/QUICKSTART.rst
@@ -150,10 +150,32 @@ Use *cron job* to schedule running these two scripts on December 25 and December
download_ena_records.py will download the following files to folder Data/information:
- ena_read_run.xml
+ + run accession
+ + title
+ + experiment accession
+
- ena_read_experiment.xml
-- ena_sample.xml
-- ema_study.xml
+ + experiment accession (e.g., SRX19576837)
+ + study accession (e.g., SRP425891)
+ + sample descriptor (e.g., SRS16960021)
+ + title
+ + library strategy (e.g., RNA-Seq)
+ + library source (e.g., TRANSCRIPTOMIC)
+- ena_sample.xml
+ + sample accession, e.g., SRS17013367
+ + taxon id (e.g., 3702)
+ + title
+
+- ena_study.xml
+ + study accession, e.g., SRP425891
+ + study abstract
+
+The above four files can be linked through accessions.
+Given a run accession in ena_read_run.xml, we can get its associated experiment accession.
+Given an experiment accession in ena_read_experiment.xml, we can get its associated sample accession and study accession.
+Given a sample accession, we can find its taxon id in ena_sample.xml.
+Given a study accession, we can find its abstract in ena_study.xml.
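
The accession chain described above (run, then experiment, then sample and study) can be sketched as follows. This is a minimal illustration and not the project's own parser: the element names (``EXPERIMENT_REF``, ``SAMPLE_DESCRIPTOR``, ``TAXON_ID``, ``STUDY_ABSTRACT``) are assumed to follow the usual SRA/ENA XML layout, the helper names are hypothetical, and the accessions are made-up placeholders rather than real records.

```python
import xml.etree.ElementTree as ET

# Tiny in-memory stand-ins for the four downloaded files.
# Layout and accessions are assumptions made for illustration only.
RUN_XML = (
    '<RUN_SET><RUN accession="RUN1"><TITLE>run title</TITLE>'
    '<EXPERIMENT_REF accession="EXP1"/></RUN></RUN_SET>'
)
EXPERIMENT_XML = (
    '<EXPERIMENT_SET><EXPERIMENT accession="EXP1">'
    '<STUDY_REF accession="STUDY1"/>'
    '<DESIGN><SAMPLE_DESCRIPTOR accession="SAMPLE1"/></DESIGN>'
    '</EXPERIMENT></EXPERIMENT_SET>'
)
SAMPLE_XML = (
    '<SAMPLE_SET><SAMPLE accession="SAMPLE1">'
    '<SAMPLE_NAME><TAXON_ID>3702</TAXON_ID></SAMPLE_NAME>'
    '</SAMPLE></SAMPLE_SET>'
)
STUDY_XML = (
    '<STUDY_SET><STUDY accession="STUDY1"><DESCRIPTOR>'
    '<STUDY_ABSTRACT>example abstract</STUDY_ABSTRACT>'
    '</DESCRIPTOR></STUDY></STUDY_SET>'
)


def index_by_accession(xml_text, tag):
    """Map each record's accession attribute to its element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    return {elem.get("accession"): elem for elem in root.findall(tag)}


def link_run(run_accession):
    """Follow run -> experiment -> (sample, study) and collect the linked fields."""
    runs = index_by_accession(RUN_XML, "RUN")
    experiments = index_by_accession(EXPERIMENT_XML, "EXPERIMENT")
    samples = index_by_accession(SAMPLE_XML, "SAMPLE")
    studies = index_by_accession(STUDY_XML, "STUDY")

    # Run record points at its experiment.
    experiment_acc = runs[run_accession].find("EXPERIMENT_REF").get("accession")
    experiment = experiments[experiment_acc]

    # Experiment record points at both its sample and its study.
    sample_acc = experiment.find("DESIGN/SAMPLE_DESCRIPTOR").get("accession")
    study_acc = experiment.find("STUDY_REF").get("accession")

    return {
        "run": run_accession,
        "experiment": experiment_acc,
        "sample": sample_acc,
        "study": study_acc,
        "taxon_id": samples[sample_acc].findtext("SAMPLE_NAME/TAXON_ID"),
        "abstract": studies[study_acc].findtext("DESCRIPTOR/STUDY_ABSTRACT"),
    }
```

With real data, the four strings would be replaced by the contents of the files in Data/information, and the element paths adjusted if the downloaded XML is structured differently.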
parse_end_xlm.py will use the above four XML files as input, and output the following two files to Data/information:
- rnaseq_info_database.json (containing run IDs, library_source and library_strategy)