Flashback is a Python scraper for the Swedish bulletin board site flashback.org. The scraper does one thing, and it does it well.
>>> import flashback
>>> thread = flashback.get('https://www.flashback.org/t2629982')
>>> thread.title
u'Min hund morrar åt mig'
>>> len(thread)
47
Use pip to install Flashback.
$ pip install flashback
url = 'https://www.flashback.org/<some-url>'
thread = flashback.get(url)
Each thread has the following attributes:
- thread_name
- thread_id
- section_id
- section_name
Each post in the thread is an object with the following attributes:
- id
- user_name
- user_id
- timestamp
- content
# Iterate over the posts in a thread
for post in thread:
    print(post.content)
# Access a single post by index
thread[23]
The describe() method returns a summary of the thread's basic facts, such as the most common authors.
thread.describe()
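To illustrate the kind of fact such a summary contains, here is a minimal sketch of counting the most common authors from the post attributes listed above. It does not reproduce how describe() works internally; the `Post` namedtuple and the `most_common_authors` helper are hypothetical stand-ins so the example runs without network access.

```python
from collections import Counter, namedtuple

# Hypothetical stand-in for the post objects; the real ones come from
# flashback.get(). Field names follow the attribute list above.
Post = namedtuple("Post", ["id", "user_name", "user_id", "timestamp", "content"])


def most_common_authors(posts, n=3):
    """Return the n most frequent (user_name, post_count) pairs."""
    return Counter(post.user_name for post in posts).most_common(n)


posts = [
    Post(1, "anna", 10, None, "First post"),
    Post(2, "bertil", 11, None, "A reply"),
    Post(3, "anna", 10, None, "Another reply"),
]
print(most_common_authors(posts))  # [('anna', 2), ('bertil', 1)]
```

With a real thread, the same helper could presumably be called as `most_common_authors(thread)`, since a thread can be iterated post by post.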
Supported output formats are CSV and JSON.
thread.to_csv('some_discussion.csv')
thread.to_json('some_discussion.json')
Some posts contain specially formatted HTML elements, such as quote containers and hidden spoilers. These HTML tags are parsed and formatted using a simple notation.
Quotes begin with [FQ] and end with [EFQ].
[FQ]This is a quote in a post[EFQ]
This is the actual content of the post.
Spoilers begin with [FSP] and end with [EFP].
[FSP]This is a hidden spoiler[EFP]
This is normal text content.
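When working with post content downstream, the quote and spoiler markers can be separated from the rest of the text. The following is a rough sketch using regular expressions, not part of the Flashback package; `split_content` is a hypothetical helper, and it assumes markers are not nested.

```python
import re

# Marker pairs as documented above. Nested markers are not handled.
QUOTE_RE = re.compile(r"\[FQ\](.*?)\[EFQ\]", re.DOTALL)
SPOILER_RE = re.compile(r"\[FSP\](.*?)\[EFP\]", re.DOTALL)


def split_content(content):
    """Return (quotes, spoilers, remaining_text) for one post's content."""
    quotes = QUOTE_RE.findall(content)
    spoilers = SPOILER_RE.findall(content)
    # Remove both kinds of marked-up spans, leaving the author's own text.
    remaining = SPOILER_RE.sub("", QUOTE_RE.sub("", content)).strip()
    return quotes, spoilers, remaining


content = "[FQ]This is a quote in a post[EFQ]\nThis is the actual content of the post."
quotes, spoilers, remaining = split_content(content)
print(quotes)     # ['This is a quote in a post']
print(remaining)  # This is the actual content of the post.
```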
Flashback prepends all external links with https://www.flashback.org/leave.php?u=<some-url>, a transit page that users have to click through to leave Flashback. When these links are parsed, the transit page part of the href is removed.
The parsed link contains the original href value and the original text content. In cases where the link has just been pasted into the post, these two values are identical. The parsed link is wrapped in [FA] and [EFA].
Original link:
<a href="http://example.com">Look at this</a>
Link in Flashback post:
<a href="https://www.flashback.org/leave.php?u=http://example.com">Look at this</a>
Parsed link result:
[FA]http://example.com Look at this[EFA]
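The transformation above can be sketched in a few lines. This is an illustration rather than the package's actual parser: `strip_transit` and `format_link` are hypothetical names, and the sketch assumes the target URL appears unencoded after `u=`, as in the example above.

```python
TRANSIT_PREFIX = "https://www.flashback.org/leave.php?u="


def strip_transit(href):
    """Drop the transit page prefix from an href, if present."""
    if href.startswith(TRANSIT_PREFIX):
        return href[len(TRANSIT_PREFIX):]
    return href


def format_link(href, text):
    """Wrap the cleaned href and the link text in [FA] ... [EFA]."""
    return "[FA]{} {}[EFA]".format(strip_transit(href), text)


print(format_link(
    "https://www.flashback.org/leave.php?u=http://example.com",
    "Look at this",
))  # [FA]http://example.com Look at this[EFA]
```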
Timestamps are converted to naive Python datetime objects (no tzinfo attached) and represent local time in the Europe/Stockholm timezone.
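Because the datetimes are naive, comparing them with timestamps from other sources requires attaching the timezone first. A minimal sketch, assuming Python 3.9+ (for the standard-library `zoneinfo` module) and an available timezone database; `to_utc` is a hypothetical helper, not part of Flashback.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

STOCKHOLM = ZoneInfo("Europe/Stockholm")


def to_utc(naive_timestamp):
    """Interpret a naive datetime as Stockholm local time and convert to UTC."""
    return naive_timestamp.replace(tzinfo=STOCKHOLM).astimezone(timezone.utc)


# Stockholm is UTC+1 in winter and UTC+2 in summer.
print(to_utc(datetime(2016, 1, 15, 12, 0)).isoformat())  # 2016-01-15T11:00:00+00:00
print(to_utc(datetime(2016, 7, 15, 12, 0)).isoformat())  # 2016-07-15T10:00:00+00:00
```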