Congratulations! You've aggregated a bunch of changes! To show the most recent entries first, sort them by date and reverse. (The sort key below is partly reconstructed; the original snippet was truncated, so the `date_parsed` attribute is an assumption based on how feedparser entries are usually sorted.)

    sorted_entries = sorted(entries, key=lambda entry: entry.date_parsed)
    sorted_entries.reverse()  # for most recent entries first

Usually, such performance is not necessary unless you have thousands of feeds to retrieve every hour. If you have fewer than a few hundred feeds an hour to retrieve, fetching them one at a time is probably better: why peak out your processor and bandwidth? This makes sense if you're just writing a client aggregator for reading blogs. But if you're compiling parts of a web page, then you want to generate a response within 20 seconds, not 3 minutes.

Be polite to the sites you poll. Try to retrieve things a few times a day or week; don't request hourly updates unless you need them. Avoid updates on the hour or half hour: try a random time into the hour, like 27 or 33 minutes past, or poll at an interval like 693 minutes rather than 600, so that you only rarely poll sites near the hour boundary.

Are you concerned that I'm encouraging people to reduplicate efforts, making aggregator after aggregator after aggregator? Not really: there are still good reasons to write aggregating code. In particular, I wrote this code because I needed a MoinMoin macro that aggregated RSS feeds, and I imagine there are other good reasons to write your own. That said, if you don't want to write your own, RawDog is a ready-made aggregator; it is Python, and it uses Feed Parser, so I've linked it at the bottom of the page.

I'm moving the following out of the main text: Getting the "author"/"contributor" out of most ModWiki RSS feeds with the feedparser module is a bit confusing as of now. Right now (feedparser 3.3), it goes into the "rdf_value" attribute of the entry.

Links:

RawDog: an RSS aggregator written in Python, using Feed Parser.
Feedjack: a Planet-like feed aggregator using Universal Feed Parser and the Django web framework.
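The advice above about avoiding hour boundaries can be sketched as a small scheduling helper. This is a hypothetical sketch, not part of the original tutorial: the function names and the candidate nudge values are my own illustrative choices.

```python
import random

def polling_interval_minutes(base=600):
    """Pick a polling interval, in minutes, that drifts past the hour boundary.

    If the base interval is a multiple of 60 (like 600), every poll lands at
    the same minute past the hour. Nudging it (e.g. 693 instead of 600, as in
    the text) makes successive polls drift around the clock. The candidate
    nudges below are arbitrary illustrative values, none a multiple of 60.
    """
    if base % 60 == 0:
        base += random.choice([3, 7, 33, 93])
    return base

def first_poll_minute():
    """Start at a random minute into the hour (like 27 or 33), avoiding :00 and :30."""
    return random.choice([m for m in range(60) if m not in (0, 30)])
```

A scheduler built on these would poll each site at `first_poll_minute()` past some hour, then every `polling_interval_minutes()` minutes thereafter, so only rarely does any request land near an hour boundary.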
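The one-at-a-time retrieval suggested above can be sketched like this. `fetch` stands in for whatever parser you use (`feedparser.parse`, for example) and is passed as a parameter so the sketch stays self-contained; the pause parameter is an assumption of mine, not something the original specifies.

```python
import time

def retrieve_sequentially(urls, fetch, pause_seconds=0.0):
    """Fetch feeds one at a time instead of in parallel.

    For a few hundred feeds an hour, sequential fetching avoids saturating
    CPU and bandwidth. `fetch` is any callable that takes a URL, e.g.
    feedparser.parse.
    """
    results = []
    for url in urls:
        results.append(fetch(url))
        if pause_seconds:
            time.sleep(pause_seconds)  # optionally spread the requests out
    return results
```

Example use: `retrieve_sequentially(feed_urls, feedparser.parse)` collects one parsed feed per URL, in order, without ever having two requests in flight at once.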