|
ImportingEntrezSummary
How to import Entrez Summary in order to get more info for the genes
IntroductionEntrez Summary is available from http://www.ncbi.nlm.nih.gov/sites/entrez DetailsA ruby script was used. It reads a file "geneIDs.txt" that is generated from a simple generif query: select distinct(geneID) from generif order by geneid; The package REXML was used to parse xml. #!/usr/bin/env ruby
require 'net/http'
require 'rexml/document'
include REXML
host = "eutils.ncbi.nlm.nih.gov"
path = "/entrez/eutils/efetch.fcgi?db=gene&tool=genequad&retmode=xml&id="
http = Net::HTTP.new(host)
myIDs = File.open("geneIDs.txt").read
myIDs.each{ |line|
id = line.strip
response, = http.get(path + id)
result = response.body
doc = Document.new(result)
root = doc.root
geneID = root.elements[1].elements[1].elements[1].elements[1].to_a[0]
description = root.elements[1].elements[6].to_a[0]
puts "GeneID"
puts geneID
puts description
}This .cpp was used to get the results. In the future, it will insert the data in a table in the database. #include <stdio.h>
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main(){
ifstream in("temp.txt");
string s;
int count=0;
int fileID=0;
getline(in,s);
while(s=="GeneID"){
getline(in,s);
printf("GeneId = %s\n",s.c_str());
getline(in,s);
printf("Description = %s\n\n",s.c_str());
while(getline(in,s)){
if(s=="GeneID") break;
}
}
}
|
Sign in to add a comment