Python script which turns a Tweet Archive into a single, nicely-formatted Markdown file. Allows some (basic) custom formatting options.
The script reads tweets in chronological order, and the output file will contain the oldest tweet at the top.1
This can be done by going to the Settings page on the Twitter website, and clicking Request your archive. You'll get an email containing a download link for your archive. You'll need to unzip it once it's downloaded.
At the top of tweet_archive_logger.py, there are five options you need to change for the script to work for you:
usernameis your Twitter username, without the @ symbol.csv_pathis the file path of the CSV file inside the tweet archive folder.output_pathis the file path to the file you want to save the Markdown output into.tweet_formatis the format each tweet should be in; see below for more info.date_formatis the format timestamps should be changed into; see below for more info.
This is simply a string which gets replicated for each tweet in your archive. There are a few placeholders you can use which get replaced with data from the tweet:
idis the long id number that Twitter gives each tweet.dateis the date of the tweet, formatted according todate_format.textis the actual text of the tweet.urlis the URL which points to the tweet on the Twitter website. When you use a placeholder insidetweet_format, it must be surrounded with double curly braces.
Here is an example of tweet_format:
{{text}}
[{{date}}]({{url}})
---
The above formats a tweet like this:
Using tables for layout like it's 1996.
[March 31, 2014 at 09:36PM](http://twitter.com/jobbogamer/status/450748504517115905)
---
This is simply a string which gets replicated when you use the {{date}} placeholder. The date string itself is made up of a number of placeholders:
yearis the four digit year, e.g.2014.month_numis the month number with a leading zero,01through12.monthis the month name,JanuarythroughDecember.month_shortis the first three letters of the month name,JanthroughDec.dayis the day of the month with a leading zero,01through31.houris the hour in 24-hour format with a leading zero,00through23.hour_12is the hour in 12-hour format with a leading zero,01through12.minuteis the minute with a leading zero,00through59.secondis the second with a leading zero,00through59.amis the AM/PM qualifier, eitherAMorPM. When you use a placeholder insidedate_format, it must be surrounded with double curly braces.
Here is an example of date_format:
{{day}} {{month}}, {{year}} at {{hour}}:{{minute}}{{am}}
The above formats a date like this:
March 31, 2014 at 09:36PM
Now just run the script like you would any other Python script, either through your IDE or from the command line.
Feel free to fork, change things, and issue a pull request. This isn't a very robust script, and there aren't many formatting options at the moment.
1: This is done because I used this script to start a tweet log that gets appended to by ifttt. Since ifttt can only append to files at the bottom, the tweets had to be oldest first.