Commit

ascl/ads bibcode comparisons
siddharm committed Nov 4, 2019
1 parent 03e514f commit 4c97503
Showing 7 changed files with 8,191 additions and 0 deletions.
24 changes: 24 additions & 0 deletions ascl-ads-comparison/README.md
@@ -0,0 +1,24 @@
much of this is ad hoc, so a couple of manual steps are required to reproduce
the intended result

1. place a .ads\_key file containing the text of your ADS API key in the folder above this one

2. the file ascl\_codes must be obtained through a wget or lynx download, with
some manipulation so that each code is listed on its own line

this is what I did:
* edit /etc/lynx/lynx.cfg to have persistent cookies
* log in to ascl.net/adm with lynx
* lynx -width=999 -dump -nolist -nomargins http://ascl.net/code/utility/ascl2 | awk -F: '{if (NF==2) { print "#",$0} else {print $0} }' > ascl2a.txt
* sed -i s/\#//g ascl2a.txt
* awk -F: '{print $2}' ascl2a.txt | awk '{printf("ascl.%s %s\n",$2,$3)}' > ascl2b.txt
* awk '{print $1}' ascl2b.txt > ascl\_codes



3. the file ads\_codes is obtained by running ads\_checker.py

4. code\_comparison reads ads\_codes and checks whether each of its lines appears
in ascl\_codes
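
The check in step 4 can be sketched as follows. The code\_comparison script itself is not shown here, so this is only a minimal illustration, assuming a line "appears" when it matches a line of ascl\_codes exactly after stripping whitespace. The names lines\_missing and compare\_files are hypothetical.

```python
from pathlib import Path


def lines_missing(ads_lines, ascl_lines):
    """Return the stripped, non-empty ADS lines that have no exact match
    among the ASCL lines, in sorted order.

    Assumption: "appears in" means exact line equality. The real
    code_comparison script may use a different rule.
    """
    ascl_set = {line.strip() for line in ascl_lines if line.strip()}
    ads_set = {line.strip() for line in ads_lines if line.strip()}
    return sorted(ads_set - ascl_set)


def compare_files(ads_path="ads_codes", ascl_path="ascl_codes"):
    """Hypothetical helper: run the same check on the two files on disk."""
    ads_lines = Path(ads_path).read_text().splitlines()
    ascl_lines = Path(ascl_path).read_text().splitlines()
    return lines_missing(ads_lines, ascl_lines)
```

Calling compare\_files() from this folder after steps 2 and 3 would return the ads\_codes entries that have no counterpart in ascl\_codes.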

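For readers who prefer not to chain sed and awk, the text processing in step 2 (everything after the lynx dump) can be approximated in Python. This is a sketch of the same transformation, not a tested replacement. It relies on the fact that the "#" marker added by the first awk command is removed again by sed, so the net effect is to drop every "#", take the second colon-separated field, and prefix its second whitespace-separated token with "ascl.". The function name is hypothetical.

```python
def ascl_code_from_dump_line(line):
    """Mimic the sed/awk steps for a single line of the lynx dump."""
    line = line.replace("#", "")                    # sed -i s/\#//g
    fields = line.split(":")                        # awk -F:
    second = fields[1] if len(fields) > 1 else ""   # {print $2}
    tokens = second.split()                         # awk's default whitespace split
    code = tokens[1] if len(tokens) > 1 else ""     # $2 of that field
    return "ascl." + code                           # printf("ascl.%s ...") then {print $1}


def ascl_codes_from_dump(lines):
    """Apply the per-line transformation to every line of the dump."""
    return [ascl_code_from_dump_line(line) for line in lines]
```

Like the shell pipeline, this emits one output line per input line, so lines without a usable second field come out as a bare "ascl." and may need to be filtered out by hand.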

67 changes: 67 additions & 0 deletions ascl-ads-comparison/ads_checker.py
@@ -0,0 +1,67 @@
import sys
import os
import json
import requests

'''
REQUIRES 1 FILE IN THE FOLDER ABOVE THIS ONE:
- .ads_key : text of your ads api key
'''


# Read the API key from the folder above this script and strip whitespace.
with open(os.path.join(sys.path[0], "../.ads_key"), "r") as f:
    key = f.read().strip()
#print("your key is: " + key + "\n")

headers = {
    "Authorization": "Bearer:" + key,
}

def check_pages(num):
    """Fetch up to `num` pages of results and return the sorted ascl.soft bibcodes."""
    return_list = []
    curr_result = 0
    for k in range(num):
        #print("starting at result " + str(curr_result))

        params = (
            ('q', 'ascl'),
            ('fl', 'bibcode'),
            ('rows', "3000"),
            ('start', str(curr_result))
        )

        response = requests.get('https://api.adsabs.harvard.edu/v1/search/query', headers=headers, params=params)
        #print(response)
        data = response.json()
        docs = data["response"]["docs"]

        # numFound is the total number of matches for the whole query, not the
        # size of this page, so advance the offset by the docs actually returned.
        curr_result = curr_result + len(docs)

        #print(str(len(docs)) + " total found on page " + str(k))

        new_rel = []
        for i in docs:
            bc = i['bibcode']
            if "ascl.soft" in bc:
                new_rel.append(bc)

        #print(str(len(new_rel)) + " relevant found on page " + str(k))

        return_list.extend(new_rel)
    return_list.sort()
    return return_list



lst = check_pages(3)

# Write one bibcode per line; the with-block closes the file when done.
with open(os.path.join(sys.path[0], "ads_codes"), "w") as f:
    for i in lst:
        f.write(i + "\n")
