author | Hui Lan <lanhui@zjnu.edu.cn> | 2020-02-11 18:03:52 +0800 |
---|---|---|
committer | Hui Lan <lanhui@zjnu.edu.cn> | 2020-02-11 18:03:52 +0800 |
commit | cc6858ae8e1a3eb24f68047ec4009b0bb9ab10ad (patch) | |
tree | 89562dbe9a7c1180939ec6ee3e04336f23d77230 /Code | |
parent | 02685414bd26fc446eb08152506f34ff2a642dfe (diff) |
merge_edges.py: make a better key
Use a combination of the target gene ID and the TF gene ID as the key. So if we have the following:
Target: AT5G09445 AT5G09445
TF: AT1G53910 RAP2.12
Then the key will be "AT5G09445_AT1G53910".
Before, it was "AT5G09445 AT5G09445 AT1G53910 RAP2.12". This is OK in most cases, as long as a gene ID's corresponding gene name is consistent. But if "AT1G53910" appears with a different gene name, then we will have a DIFFERENT key, which is not what we want.
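The difference between the two keys can be sketched as below. This is a minimal illustration, not the code in merge_edges.py: the helper names `old_edge_key` and `new_edge_key` and the alternative gene name `ALT_NAME` are hypothetical, and the sketch assumes each field holds a gene ID followed by a gene name, separated by whitespace.

```python
def old_edge_key(target, tf):
    """Key as built before this commit: the two full fields concatenated."""
    return target + tf


def new_edge_key(target, tf):
    """Key as built after this commit: gene IDs only, joined by '_'.

    split()[0] takes the first whitespace-separated token, which is
    assumed here to be the gene ID.
    """
    return target.split()[0] + '_' + tf.split()[0]


target = 'AT5G09445 AT5G09445'
tf_a = 'AT1G53910 RAP2.12'
tf_b = 'AT1G53910 ALT_NAME'  # hypothetical: same gene ID, different gene name

# The new key ignores the gene name, so both records map to one key.
same_new = new_edge_key(target, tf_a) == new_edge_key(target, tf_b)

# The old key includes the gene name, so the same edge gets two keys.
same_old = old_edge_key(target, tf_a) == old_edge_key(target, tf_b)

print(new_edge_key(target, tf_a))
print(same_new, same_old)
```

With the ID-only key, two records for the same target/TF pair land in the same dictionary entry even when the gene name differs between edge files.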
Diffstat (limited to 'Code')
-rw-r--r-- | Code/merge_edges.py | 2 |
1 file changed, 1 insertion, 1 deletion
diff --git a/Code/merge_edges.py b/Code/merge_edges.py
index 84535d7..62a958a 100644
--- a/Code/merge_edges.py
+++ b/Code/merge_edges.py
@@ -160,7 +160,7 @@ for fname in sorted(glob.glob(os.path.join(EDGE_POOL_DIR, 'edges*.*'))):
 strength = lst[8]
 method_or_tissue = lst[9]
-key = target + tf
+key = target.split()[0] + '_' + tf.split()[0] # target or tf has two fields, Gene ID and Gene Name, split()[0] means using Gene ID only.
 t = (target, tf, score, type_of_score, rids, cids, ll, date, strength, method_or_tissue)
 if not key in d: