Extracting the download link from the feed #10

0x5c · 2019-10-27T14:58:54Z

So it isn't hard-coded anymore

classabbyamp · 2019-12-06T17:55:17Z

not sure how to not hard-code this. the download link isn't included in the RSS feed.

<item>
    <title>Big CTY &#x2013; 20 November 2019</title>
    <link>
        http://www.country-files.com/big-cty-20-november-2019/
    </link>
    <pubDate>Wed, 20 Nov 2019 15:35:19 +0000</pubDate>
    <creator>
        <![CDATA[ AD1C ]]>
    </creator>
    <category>
        <![CDATA[ Big CTY ]]>
    </category>
    <guid isPermaLink="false">https://www.country-files.com/?p=1359</guid>
    <description>
        <![CDATA[
        Version entity is East Malaysia, 9M6 [download] Added/changed Entities/Prefixes/Callsigns: 3DA0BP/J is Kingdom of eSwatini, 3DA A41HA/ND is Oman, A4 B7/BA7CK, B7/BA7NQ, B7/BD1TX, B7CRA, BA7CK, BA7IA, BD1TX and BD7HC are all China, BY in CQ zone 26 VP8HAL is Antarctica, CE9 &#8230; <a href="http://www.country-files.com/big-cty-20-november-2019/">Continue reading <span class="meta-nav">&#8594;</span></a>
        ]]>
    </description>
</item>
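
As a quick check of the claim above, here is a minimal sketch that parses a trimmed copy of that item with the Python standard library. The shortened description and the names `ITEM_XML`, `post_url`, `urls` and `zip_urls` are only for illustration. Under those assumptions, the only URLs in the item are the post permalink and the guid, and none of them point at a .zip file.

```python
import re
import xml.etree.ElementTree as ET

# Trimmed copy of the <item> quoted above (description shortened for brevity).
ITEM_XML = """
<item>
    <title>Big CTY &#x2013; 20 November 2019</title>
    <link>
        http://www.country-files.com/big-cty-20-november-2019/
    </link>
    <guid isPermaLink="false">https://www.country-files.com/?p=1359</guid>
    <description>
        <![CDATA[
        Version entity is East Malaysia, 9M6 [download] <a href="http://www.country-files.com/big-cty-20-november-2019/">Continue reading</a>
        ]]>
    </description>
</item>
"""

item = ET.fromstring(ITEM_XML.strip())

# The <link> element holds the URL of the blog post, padded with whitespace.
post_url = item.findtext("link").strip()

# Gather every URL that appears in the item's text, including the CDATA body.
all_text = " ".join(item.itertext())
urls = re.findall(r"https?://[^\s\"<>]+", all_text)

# Keep only URLs that look like a direct archive download.
zip_urls = [u for u in urls if u.lower().endswith(".zip")]

print("post URL:", post_url)
print("zip URLs:", zip_urls)
```

So the feed can tell you which post is newest, but the archive URL itself would have to come from somewhere else, such as the post page.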

mbridak · 2023-06-03T17:01:56Z

You can extract the link via:

"""Get URL to new bigcty file"""

import feedparser
import requests
from lxml import html

DEFAULT_FEED = "http://www.country-files.com/category/big-cty/feed/"

# Fetch the RSS feed; entries[0] is assumed to be the newest post.
feed = requests.get(DEFAULT_FEED, timeout=15)
feed.raise_for_status()
parsed_feed = feedparser.parse(feed.content)
update_url = parsed_feed.entries[0]["link"]

# Fetch the post page and take the first anchor whose href contains "zip".
page = requests.get(update_url, timeout=15)
page.raise_for_status()
tree = html.fromstring(page.content)
link = tree.xpath("//a[contains(@href,'zip')]/@href")[0]

print(link)

which today spits out:
https://www.country-files.com/bigcty/download/2023/bigcty-20230526.zip
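
If pulling in lxml is a concern, the link-finding step could also be sketched with only the standard library. This is a variant, not the snippet above: `find_zip_link`, `_ZipLinkParser` and the `SAMPLE_PAGE` markup are made-up names and data for illustration. It matches hrefs that end in `.zip` (stricter than the `contains(@href,'zip')` XPath) and returns `None` instead of raising when the page has no such link.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _ZipLinkParser(HTMLParser):
    """Collect href values of <a> tags that point at a .zip file."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Ignore any query string when checking the extension.
        if href and href.lower().split("?", 1)[0].endswith(".zip"):
            self.hrefs.append(href)


def find_zip_link(page_html, base_url):
    """Return the first .zip link in page_html as an absolute URL, or None."""
    parser = _ZipLinkParser()
    parser.feed(page_html)
    if not parser.hrefs:
        return None
    # Resolve relative hrefs against the page they were found on.
    return urljoin(base_url, parser.hrefs[0])


# Hypothetical page markup, only loosely modelled on a blog post.
SAMPLE_PAGE = """
<html><body>
<p>Version entity is East Malaysia, 9M6</p>
<a href="/about/">About</a>
<a href="/bigcty/download/2023/bigcty-20230526.zip">[download]</a>
</body></html>
"""

print(find_zip_link(SAMPLE_PAGE, "https://www.country-files.com/"))
print(find_zip_link("<a href='/about/'>About</a>", "https://www.country-files.com/"))
```

Returning `None` lets the caller fall back to a previously known URL or report the failure, rather than crashing with an `IndexError` when the page layout changes.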

classabbyamp · 2023-06-03T19:50:15Z

nice! if you'd like to make a PR for this, feel free :)

0x5c added the enhancement, help wanted, and priority labels Oct 27, 2019

0x5c removed the priority label Nov 12, 2019

0x5c mentioned this issue Nov 18, 2019

Remove hard-coded paths #3

Closed

mbridak mentioned this issue Jun 4, 2023

Dynamically extract the download URL from the feed #47

Closed

8 tasks
