Extracting the download link from the feed #10

Open
0x5c opened this issue Oct 27, 2019 · 3 comments
Labels
enhancement (New feature or request), help wanted (Extra attention is needed)

Comments

@0x5c
Member

0x5c commented Oct 27, 2019

So it isn't hard-coded anymore

0x5c added the enhancement, help wanted, and priority labels Oct 27, 2019
0x5c removed the priority label Nov 12, 2019
@classabbyamp
Member

not sure how to not hard-code this. the download link isn't included in the RSS feed.

<item>
    <title>Big CTY &#x2013; 20 November 2019</title>
    <link>
        http://www.country-files.com/big-cty-20-november-2019/
    </link>
    <pubDate>Wed, 20 Nov 2019 15:35:19 +0000</pubDate>
    <creator>
        <![CDATA[ AD1C ]]>
    </creator>
    <category>
        <![CDATA[ Big CTY ]]>
    </category>
    <guid isPermaLink="false">https://www.country-files.com/?p=1359</guid>
    <description>
        <![CDATA[
        Version entity is East Malaysia, 9M6 [download] Added/changed Entities/Prefixes/Callsigns: 3DA0BP/J is Kingdom of eSwatini, 3DA A41HA/ND is Oman, A4 B7/BA7CK, B7/BA7NQ, B7/BD1TX, B7CRA, BA7CK, BA7IA, BD1TX and BD7HC are all China, BY in CQ zone 26 VP8HAL is Antarctica, CE9 &#8230; <a href="http://www.country-files.com/big-cty-20-november-2019/">Continue reading <span class="meta-nav">&#8594;</span></a>
        ]]>
    </description>
</item>
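
To make the point concrete, here is a minimal sketch that parses a trimmed copy of the item above with the standard library and lists every href it contains. The function name `inspect_item` and the shortened description are illustrative assumptions, not part of the project. The only URL the item yields is the post page; the "[download]" text in the description carries no link.

```python
import re
import xml.etree.ElementTree as ET

# Trimmed copy of the <item> quoted above; the description is shortened
# for illustration, the title and link are unchanged.
SAMPLE_ITEM = """<item>
    <title>Big CTY &#x2013; 20 November 2019</title>
    <link>
        http://www.country-files.com/big-cty-20-november-2019/
    </link>
    <description>
        <![CDATA[
        Version entity is East Malaysia, 9M6 [download] <a href="http://www.country-files.com/big-cty-20-november-2019/">Continue reading</a>
        ]]>
    </description>
</item>"""


def inspect_item(item_xml: str) -> tuple[str, list[str]]:
    """Return the post URL and every href found in the description HTML."""
    item = ET.fromstring(item_xml)
    # The <link> text is padded with whitespace in the feed, so strip it.
    post_url = (item.findtext("link") or "").strip()
    description = item.findtext("description") or ""
    hrefs = re.findall(r'href="([^"]+)"', description)
    return post_url, hrefs


post_url, hrefs = inspect_item(SAMPLE_ITEM)
zip_hrefs = [h for h in hrefs if h.lower().endswith(".zip")]
print(post_url)   # the post page, not the archive
print(zip_hrefs)  # empty: no zip link anywhere in the item
```

So any approach that avoids hard-coding has to follow the post link and look for the archive on that page.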

@mbridak

mbridak commented Jun 3, 2023

You can extract the link via:

"""Get URL to new bigcty file"""

import feedparser
import requests
from lxml import html

DEFAULT_FEED = "http://www.country-files.com/category/big-cty/feed/"

# Fetch the feed; the newest post is the first entry.
feed = requests.get(DEFAULT_FEED, timeout=15)
feed.raise_for_status()
parsed_feed = feedparser.parse(feed.content)
update_url = parsed_feed.entries[0]["link"]

# Fetch the post page and take the first link whose href contains "zip".
page = requests.get(update_url, timeout=15)
page.raise_for_status()
tree = html.fromstring(page.content)
link = tree.xpath("//a[contains(@href,'zip')]/@href")[0]

print(link)

which today spits out:
https://www.country-files.com/bigcty/download/2023/bigcty-20230526.zip
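
If this ends up in a PR, the link extraction could be split from the network calls so it can be tested offline and fail gracefully when a post has no archive link. The following is only a sketch under stated assumptions: `find_zip_link`, `SAMPLE_PAGE`, and `BASE_URL` are hypothetical names, the sample markup is invented (the real page structure may differ), and it swaps lxml for the standard library's `html.parser`. It also matches `.zip` case-insensitively rather than any href containing `zip`, and resolves relative hrefs against the post URL.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urljoin


class _ZipLinkFinder(HTMLParser):
    """Collect the href of every <a> tag that points at a .zip file."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and ".zip" in href.lower():
            self.hrefs.append(href)


def find_zip_link(page_html: str, base_url: str) -> Optional[str]:
    """Return the first .zip link as an absolute URL, or None if absent."""
    finder = _ZipLinkFinder()
    finder.feed(page_html)
    if not finder.hrefs:
        return None
    # urljoin leaves absolute hrefs alone and resolves relative ones.
    return urljoin(base_url, finder.hrefs[0])


# Hypothetical markup and post URL, for illustration only.
BASE_URL = "https://www.country-files.com/some-post/"
SAMPLE_PAGE = (
    '<p>Version entity is ... '
    '<a href="/bigcty/download/2023/bigcty-20230526.zip">[download]</a></p>'
)

print(find_zip_link(SAMPLE_PAGE, BASE_URL))
```

Returning `None` instead of indexing `[0]` directly lets the caller decide what to do when the page layout changes, for example keeping the previously downloaded file.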

@classabbyamp
Member

nice! if you'd like to make a PR for this, feel free :)


3 participants