53 Commits
0.3 ... 0.7.0

Author SHA1 Message Date
KnugiHK
abf4b20bc6 Merge branch 'main' of https://github.com/KnugiHK/Whatsapp-Chat-Exporter 2021-12-28 19:44:00 +08:00
KnugiHK
f2f6258960 Bump version number 2021-12-28 19:38:01 +08:00
Knugi
62af48c78e Update README.md 2021-12-28 11:34:23 +00:00
Knugi
c9158d202d Update To do 2021-12-28 11:27:29 +00:00
KnugiHK
fb5e4d5421 Merge branch 'dev' 2021-12-28 19:26:02 +08:00
Knugi
d85c91fbdc Update README.md 2021-11-18 03:09:03 +00:00
KnugiHK
0dddb63c5e Merge branch 'main' into dev 2021-08-13 20:28:49 +08:00
KnugiHK
dd36960ecb Implement CSS for metadata 2021-08-13 20:25:59 +08:00
Knugi
620a1bcdb7 Update README.md 2021-07-13 10:54:07 +00:00
Knugi
896a6d2ddd Update README.md 2021-07-13 10:53:41 +00:00
KnugiHK
4ee92e7efc Bug fix: The directory cannot be created if the parent directory is not present 2021-07-11 18:24:59 +08:00
KnugiHK
f91aac1e11 Implementing newline to <br> 2021-07-11 18:17:06 +08:00
KnugiHK
27a6ff98b3 Merge branch 'main' into dev 2021-07-11 11:05:41 +08:00
Knugi
1952c0835c Update README.md 2021-07-11 03:05:15 +00:00
KnugiHK
ab42cad166 Some PEP8 2021-07-10 22:01:04 +08:00
KnugiHK
3ed59ee051 Bug fix 2021-07-10 21:53:28 +08:00
KnugiHK
f9358ded14 Update README.md 2021-07-10 21:50:30 +08:00
KnugiHK
790f4ec5e0 Support custom template 2021-07-10 21:46:45 +08:00
KnugiHK
35ef4031fc Support crypt12 2021-07-10 21:28:49 +08:00
KnugiHK
b9f343cf2f Update README and setup.py 2021-07-10 21:11:49 +08:00
KnugiHK
18ee152688 Support crypt14 WhatsApp Backup 2021-07-10 21:08:52 +08:00
Knugi
620e89a185 Update README.md 2021-07-06 03:17:57 +00:00
KnugiHK
3ada8916f9 Merge branch 'main' of https://github.com/KnugiHK/Whatsapp-Chat-Exporter into main 2021-05-31 16:13:07 +08:00
Knugi
07ebb692e5 Create CNAME 2021-05-31 08:05:44 +00:00
Knugi
7255f0fe2b Set theme jekyll-theme-cayman 2021-05-31 08:04:21 +00:00
KnugiHK
684badb9a6 Update the string appear in wtsexporter --version 2021-05-31 11:09:47 +08:00
Knugi
e1221d9f59 Update README.md 2021-05-31 03:04:07 +00:00
KnugiHK
fc84e430ed Update setup.py 2021-05-31 10:59:29 +08:00
KnugiHK
0b0af518c3 Update setup.py 2021-05-31 10:54:35 +08:00
KnugiHK
7e84595074 Update setup.py 2021-05-31 10:49:48 +08:00
KnugiHK
1bdd5fe6df Update extract_iphone_media.py 2021-05-31 10:47:40 +08:00
KnugiHK
0ac7eecb47 Update version number for Pypi 2021-05-31 10:44:04 +08:00
KnugiHK
b0a469509c Update README.md 2021-05-31 10:43:49 +08:00
Knugi
cec68dd3a0 Update README.md 2021-05-12 06:11:24 +00:00
Knugi
46e12daa6a Update README.md 2021-05-12 05:51:27 +00:00
Knugi
4dd7f4e753 Remove a to-do task 2021-05-12 05:48:43 +00:00
Knugi
6cf6e50db9 Update README.md 2021-05-12 05:47:02 +00:00
KnugiHK
366d18b678 Create android_structure.png 2021-05-12 13:44:46 +08:00
Knugi
e8a8546a13 Update README.md 2021-05-12 05:44:32 +00:00
KnugiHK
baa733df5f Create old_README.md 2021-05-12 13:24:54 +08:00
KnugiHK
d9f38fc714 Create console script 2021-05-10 15:51:43 +08:00
KnugiHK
322281a8ec Prepare for publishing in PyPi 2021-05-10 15:51:30 +08:00
KnugiHK
3ee40ecda4 Move images to a folder 2021-05-10 15:49:41 +08:00
Knugi
f591c7a57f Update README.md 2021-05-10 05:48:25 +00:00
KnugiHK
e1e49261aa Merge branch 'main' of https://github.com/KnugiHK/Whatsapp-Chat-Exporter into main 2021-05-10 13:45:37 +08:00
KnugiHK
b58aaa8f73 Resize images 2021-05-10 13:45:30 +08:00
Knugi
c62f08340e Update README.md 2021-05-10 05:40:28 +00:00
Knugi
4fb759f974 Update README.md 2021-05-10 05:39:59 +00:00
KnugiHK
cb2e83721e Merge branch 'main' of https://github.com/KnugiHK/Whatsapp-Chat-Exporter into main 2021-05-10 13:36:56 +08:00
KnugiHK
be7e20317d Add support for encrypted iPhone backup 2021-05-10 13:33:13 +08:00
Knugi
cf2e69b594 Update README.md 2021-05-10 04:54:04 +00:00
Knugi
fb33a883e6 Update README.md 2021-05-10 04:47:52 +00:00
Knugi
58bc8634f7 Update README.md 2021-05-09 05:40:24 +00:00
19 changed files with 782 additions and 267 deletions

1
CNAME Normal file

@@ -0,0 +1 @@
wts.knugi.com

113
README.md

@@ -1,45 +1,106 @@
# Whatsapp-Chat-Exporter
A Whatsapp database parser that will give you the history of your Whatsapp conversation in HTML and JSON
[![Latest in Pypi](https://img.shields.io/pypi/v/whatsapp-chat-exporter?label=Latest%20in%20Pypi)](https://pypi.org/project/whatsapp-chat-exporter/)
![License MIT](https://img.shields.io/pypi/l/whatsapp-chat-exporter)
[![Python](https://img.shields.io/pypi/pyversions/Whatsapp-Chat-Exporter)](https://pypi.org/project/Whatsapp-Chat-Exporter/)
A customizable Android and iPhone Whatsapp database parser that will give you the history of your Whatsapp conversations in HTML and JSON.
**If you plan to uninstall WhatsApp or delete your WhatsApp account, please make a backup of your WhatsApp database. You may want to use this exporter again on the same database in the future as the exporter develops**
# Usage
First, clone this repo, and copy all py and html files to a working directory if you want to do so.
**If you want to use the old release (< 0.5) of the exporter, please follow the [old usage guide](https://github.com/KnugiHK/Whatsapp-Chat-Exporter/blob/main/old_README.md#usage)**
First, install the exporter by:
```shell
git clone https://github.com/KnugiHK/Whatsapp-Chat-Exporter.git
pip install whatsapp-chat-exporter
pip install whatsapp-chat-exporter[android_backup] & :: Optional, if you want it to support decrypting Android WhatsApp backup.
```
Then, ready your WhatsApp database, place them in the root of working directory.
* For Android, it is called msgstore.db. If you want name of your contacts, get the contact database, which is called wa.db.
* For iPhone, it is called 7c7fba66680ef796b916b067077cc246adacf01d (YES, a hash).
Then, create a working directory in somewhere you want
```shell
mkdir working_wts
cd working_wts
```
## Working with Android
### Unencrypted WhatsApp database
Extract the WhatsApp database by whatever means you prefer; one option is the [WhatsApp-Key-DB-Extractor](https://github.com/KnugiHK/WhatsApp-Key-DB-Extractor).
Next, ready your media folder, place it in the root of working directory.
* For Android, copy the WhatsApp directory from your phone directly.
* For iPhone, run the extract_iphone_media.py, and you will get a folder called Message. Please note that, this script does not support encrypted backup.
```
python extract_iphone_media.py "C:\Users\[Username]\AppData\Roaming\Apple Computer\MobileSync\Backup\[device id]"
```
And now, you should have something like this:
After you obtain your WhatsApp database, copy it and the media folder to the working directory. The database is called msgstore.db. If you also want the names of your contacts, get the contact database, which is called wa.db. Copy the WhatsApp (Media) directory directly from your phone.
![Folder structure](structure.png)
And now, you should have something like this in the working directory.
Last, run the script regarding the type of phone.
![Android folder structure](imgs/android_structure.png)
#### Extracting
Simply invoke the following command from shell.
```sh
wtsexporter -a
```
python extract.py & :: Android
python extract_iphone.py & :: iPhone
### Encrypted Android WhatsApp Backup
In order to support the decryption, install pycryptodome if it is not installed
```sh
pip install pycryptodome
```
And you will get these:
Place the decryption key file (key) and the encrypted WhatsApp Backup (msgstore.db.crypt14) in the working directory. If you also want the name of your contacts, get the contact database, which is called wa.db. And copy the WhatsApp (Media) directory from your phone directly.
And now, you should have something like this in the working directory.
![Android folder structure with WhatsApp Backup](imgs/android_structure_backup.png)
#### Extracting
Simply invoke the following command from shell.
```sh
wtsexporter -a -k key -b msgstore.db.crypt14
```
## Working with iPhone
Do an iPhone Backup with iTunes first.
### Encrypted iPhone Backup
**If you are working on unencrypted iPhone backup, skip this**
If you want to work on an encrypted iPhone Backup, you should install iphone_backup_decrypt from [KnugiHK/iphone_backup_decrypt](https://github.com/KnugiHK/iphone_backup_decrypt) before you run the extract_iphone_media.py.
```sh
pip install git+https://github.com/KnugiHK/iphone_backup_decrypt
```
### Extracting
Simply invoke the following command from shell, remember to replace the username and device id correspondingly in the command.
```sh
wtsexporter -i -b "C:\Users\[Username]\AppData\Roaming\Apple Computer\MobileSync\Backup\[device id]"
```
## Results
After extracting, you will get these:
#### Private Message
![Private Message](pm.png)
![Private Message](imgs/pm.png)
#### Group Message
![Group Message](group.png)
![Group Message](imgs/group.png)
# Encrypted iPhone Backup
To do
## More options
Invoking wtsexporter with the --help option shows all available options.
```sh
> wtsexporter --help
Usage: wtsexporter [options]

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -a, --android         Define the target as Android
  -i, --iphone          Define the target as iPhone
  -w WA, --wa=WA        Path to contact database
  -m MEDIA, --media=MEDIA
                        Path to WhatsApp media folder
  -b BACKUP, --backup=BACKUP
                        Path to Android (must be used together with -k)/iPhone
                        WhatsApp backup
  -o OUTPUT, --output=OUTPUT
                        Output to specific directory
  -j, --json            Save the result to a single JSON file
  -d DB, --db=DB        Path to database file
  -k KEY, --key=KEY     Path to key file
  -t TEMPLATE, --template=TEMPLATE
                        Path to custom HTML template
```
# To do
1. Convert ```\r\n``` to ```<br>```
2. Reply in iPhone
3. The CSS for metadata (e.g. {Message Deleted})
4. Handle encrypted iPhone Backup
1. Reply in iPhone
# Copyright
This is an MIT-licensed project.
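
The Android sections of the README describe what the working directory should contain before `wtsexporter -a` is run. A minimal sketch of a pre-flight check, assuming the file names given in the README (`msgstore.db`, or `key` plus `msgstore.db.crypt14`, and the `WhatsApp` media folder); the helper `missing_android_inputs` is hypothetical and is not part of the exporter:

```python
from pathlib import Path


def missing_android_inputs(workdir, encrypted=False):
    """Return the names of expected inputs that are absent from workdir.

    Hypothetical helper: the file names come from the README, the
    function itself does not exist in the exporter.
    """
    root = Path(workdir)
    if encrypted:
        # Encrypted backup: the key file and the crypt14 backup are needed.
        required = ["key", "msgstore.db.crypt14"]
    else:
        required = ["msgstore.db"]
    missing = [name for name in required if not (root / name).is_file()]
    # The media folder is a directory copied from the phone as-is.
    if not (root / "WhatsApp").is_dir():
        missing.append("WhatsApp")
    # wa.db is optional (contact names only), so it is not checked here.
    return missing
```

An empty list means the layout matches the README description for the chosen mode.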


@@ -0,0 +1 @@
__version__ = "0.7.0"


@@ -0,0 +1,163 @@
from .__init__ import __version__
from Whatsapp_Chat_Exporter import extract, extract_iphone
from Whatsapp_Chat_Exporter import extract_iphone_media
from optparse import OptionParser
import os
import sqlite3
import shutil
import json
def main():
    parser = OptionParser(version=f"Whatsapp Chat Exporter: {__version__}")
    parser.add_option(
        '-a',
        '--android',
        dest='android',
        default=False,
        action='store_true',
        help="Define the target as Android")
    parser.add_option(
        '-i',
        '--iphone',
        dest='iphone',
        default=False,
        action='store_true',
        help="Define the target as iPhone")
    parser.add_option(
        "-w",
        "--wa",
        dest="wa",
        default=None,
        help="Path to contact database")
    parser.add_option(
        "-m",
        "--media",
        dest="media",
        default=None,
        help="Path to WhatsApp media folder")
    parser.add_option(
        "-b",
        "--backup",
        dest="backup",
        default=None,
        help="Path to Android (must be used together "
             "with -k)/iPhone WhatsApp backup")
    parser.add_option(
        "-o",
        "--output",
        dest="output",
        default="result",
        help="Output to specific directory")
    parser.add_option(
        '-j',
        '--json',
        dest='json',
        default=False,
        action='store_true',
        help="Save the result to a single JSON file")
    parser.add_option(
        '-d',
        '--db',
        dest='db',
        default=None,
        help="Path to database file")
    parser.add_option(
        '-k',
        '--key',
        dest='key',
        default=None,
        help="Path to key file"
    )
    parser.add_option(
        "-t",
        "--template",
        dest="template",
        default=None,
        help="Path to custom HTML template")
    (options, args) = parser.parse_args()
    if options.android and options.iphone:
        print("You must define only one device type.")
        exit()
    if not options.android and not options.iphone:
        print("You must define the device type.")
        exit()
    data = {}
    if options.android:
        contacts = extract.contacts
        messages = extract.messages
        media = extract.media
        vcard = extract.vcard
        create_html = extract.create_html
        if options.db is None:
            msg_db = "msgstore.db"
        else:
            msg_db = options.db
        if options.key is not None:
            if options.backup is None:
                print("You must specify the backup file with -b")
                return False
            print("Decryption key specified, decrypting WhatsApp backup...")
            key = open(options.key, "rb").read()
            db = open(options.backup, "rb").read()
            is_crypt14 = False if "crypt12" in options.backup else True
            if not extract.decrypt_backup(db, key, msg_db, is_crypt14):
                print("Dependencies of decrypt_backup are not "
                      "present. For details, see README.md")
                return False
        if options.wa is None:
            contact_db = "wa.db"
        else:
            contact_db = options.wa
        if options.media is None:
            options.media = "WhatsApp"
        if len(args) == 1:
            msg_db = args[0]
        if os.path.isfile(contact_db):
            with sqlite3.connect(contact_db) as db:
                contacts(db, data)
    elif options.iphone:
        messages = extract_iphone.messages
        media = extract_iphone.media
        vcard = extract_iphone.vcard
        create_html = extract_iphone.create_html
        if options.backup is not None:
            extract_iphone_media.extract_media(options.backup)
        if options.db is None:
            msg_db = "7c7fba66680ef796b916b067077cc246adacf01d"
        else:
            msg_db = options.db
        if options.wa is None:
            contact_db = "ContactsV2.sqlite"
        else:
            contact_db = options.wa
        if options.media is None:
            options.media = "Message"
        if len(args) == 1:
            msg_db = args[0]
    if os.path.isfile(msg_db):
        with sqlite3.connect(msg_db) as db:
            messages(db, data)
            media(db, data, options.media)
            vcard(db, data)
        create_html(data, options.output, options.template)
    if not os.path.isdir(f"{options.output}/{options.media}"):
        shutil.move(options.media, f"{options.output}/")
    if options.json:
        with open("result.json", "w") as f:
            data = json.dumps(data)
            print(f"\nWriting JSON file...({int(len(data)/1024/1024)}MB)")
            f.write(data)
    else:
        print()
    print("Everything is done!")
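
`main()` checks by hand that exactly one of `-a` and `-i` was given. A minimal sketch of the same rule expressed declaratively, assuming a switch from `optparse` (which the source uses) to `argparse`; `build_parser` is a hypothetical helper and only the two device flags are reproduced:

```python
import argparse


def build_parser():
    # Hypothetical helper: the source builds its parser with optparse.
    parser = argparse.ArgumentParser(prog="wtsexporter")
    # Exactly one device flag must be present; argparse reports an error
    # and exits with status 2 when both or neither are given.
    device = parser.add_mutually_exclusive_group(required=True)
    device.add_argument("-a", "--android", action="store_true",
                        help="Define the target as Android")
    device.add_argument("-i", "--iphone", action="store_true",
                        help="Define the target as iPhone")
    return parser
```

With this arrangement, `build_parser().parse_args(["-a"])` yields `android=True` and `iphone=False`, and the two explicit `if` checks become unnecessary.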


@@ -7,8 +7,23 @@ import os
import requests
import shutil
import re
import pkgutil
from pathlib import Path
from bleach import clean as sanitize
from markupsafe import Markup
from datetime import datetime
from mimetypes import MimeTypes
try:
    import zlib
    from Crypto.Cipher import AES
except ModuleNotFoundError:
    support_backup = False
else:
    support_backup = True


def sanitize_except(html):
    return Markup(sanitize(html, tags=["br"]))


def determine_day(last, current):
@@ -20,6 +35,39 @@ def determine_day(last, current):
return current
def decrypt_backup(database, key, output, crypt14=True):
    if not support_backup:
        return False
    if len(key) != 158:
        raise ValueError("The key file must be 158 bytes")
    t1 = key[30:62]
    if crypt14:
        if len(database) < 191:
            raise ValueError("The crypt14 file must be at least 191 bytes")
        t2 = database[15:47]
        iv = database[67:83]
        db_ciphertext = database[191:]
    else:
        if len(database) < 67:
            raise ValueError("The crypt12 file must be at least 67 bytes")
        t2 = database[3:35]
        iv = database[51:67]
        db_ciphertext = database[67:-20]
    if t1 != t2:
        raise ValueError("The signature of key file and backup file mismatch")
    main_key = key[126:]
    cipher = AES.new(main_key, AES.MODE_GCM, iv)
    db_compressed = cipher.decrypt(db_ciphertext)
    db = zlib.decompress(db_compressed)
    if db[0:6].upper() == b"SQLITE":
        with open(output, "wb") as f:
            f.write(db)
        return True
    else:
        raise ValueError("The plaintext is not a SQLite database. Did you use the key to encrypt something...")


def contacts(db, data):
# Get contacts
c = db.cursor()
@@ -71,7 +119,9 @@ def messages(db, data):
"timestamp": content[3]/1000,
"time": datetime.fromtimestamp(content[3]/1000).strftime("%H:%M"),
"media": False,
"key_id": content[13]
"key_id": content[13],
"meta": False,
"data": None
}
if "-" in content[0] and content[2] == 0:
name = None
@@ -106,8 +156,9 @@ def messages(db, data):
try:
int(content[4])
except ValueError:
msg = "{The group name changed to "f"{content[4]}"" }"
msg = f"The group name changed to {content[4]}"
data[content[0]]["messages"][content[1]]["data"] = msg
data[content[0]]["messages"][content[1]]["meta"] = True
else:
del data[content[0]]["messages"][content[1]]
else:
@@ -126,15 +177,16 @@ def messages(db, data):
name_left = data[content[8]]["name"]
else:
name_left = content[8].split('@')[0]
msg = "{"f"{name_left}"f" added {name_right or 'You'}""}"
msg = f"{name_left} added {name_right or 'You'}"
else:
msg = "{"f"Added {name_right or 'You'}""}"
msg = f"Added {name_right or 'You'}"
elif b"\xac\xed\x00\x05\x74\x00" in thumb_image:
# Changed number
original = content[8].split('@')[0]
changed = thumb_image[7:].decode().split('@')[0]
msg = "{"f"{original} changed to {changed}""}"
msg = f"{original} changed to {changed}"
data[content[0]]["messages"][content[1]]["data"] = msg
data[content[0]]["messages"][content[1]]["meta"] = True
else:
if content[4] is None:
del data[content[0]]["messages"][content[1]]
@@ -146,20 +198,34 @@ def messages(db, data):
else:
if content[2] == 1:
if content[5] == 5 and content[6] == 7:
msg = "{Message deleted}"
msg = "Message deleted"
data[content[0]]["messages"][content[1]]["meta"] = True
else:
if content[9] == "5":
msg = "{ Location shared: "f"{content[10], content[11]}"" }"
msg = f"Location shared: {content[10], content[11]}"
data[content[0]]["messages"][content[1]]["meta"] = True
else:
msg = content[4]
if msg is not None:
if "\r\n" in msg:
msg = msg.replace("\r\n", "<br>")
if "\n" in msg:
msg = msg.replace("\n", "<br>")
else:
if content[5] == 0 and content[6] == 7:
msg = "{Message deleted}"
msg = "Message deleted"
data[content[0]]["messages"][content[1]]["meta"] = True
else:
if content[9] == "5":
msg = "{ Location shared: "f"{content[10], content[11]}"" }"
msg = f"Location shared: {content[10], content[11]}"
data[content[0]]["messages"][content[1]]["meta"] = True
else:
msg = content[4]
if msg is not None:
if "\r\n" in msg:
msg = msg.replace("\r\n", "<br>")
if "\n" in msg:
msg = msg.replace("\n", "<br>")
data[content[0]]["messages"][content[1]]["data"] = msg
@@ -167,8 +233,7 @@ def messages(db, data):
if i % 1000 == 0:
print(f"Gathering messages...({i}/{total_row_number})", end="\r")
content = c.fetchone()
print(
f"Gathering messages...({total_row_number}/{total_row_number})", end="\r")
print(f"Gathering messages...({total_row_number}/{total_row_number})", end="\r")
def media(db, data, media_folder):
@@ -214,8 +279,9 @@ def media(db, data, media_folder):
# data[content[0]]["messages"][content[1]]["media"] = True
# data[content[0]]["messages"][content[1]]["mime"] = "media"
# else:
data[content[0]]["messages"][content[1]]["data"] = "{The media is missing}"
data[content[0]]["messages"][content[1]]["data"] = "The media is missing"
data[content[0]]["messages"][content[1]]["mime"] = "media"
data[content[0]]["messages"][content[1]]["meta"] = True
i += 1
if i % 100 == 0:
print(f"Gathering media...({i}/{total_row_number})", end="\r")
@@ -238,27 +304,34 @@ def vcard(db, data):
total_row_number = len(rows)
print(f"\nGathering vCards...(0/{total_row_number})", end="\r")
base = "WhatsApp/vCards"
if not os.path.isdir(base):
Path(base).mkdir(parents=True, exist_ok=True)
for index, row in enumerate(rows):
if not os.path.isdir(base):
os.mkdir(base)
file_name = "".join(x for x in row[3] if x.isalnum())
file_path = f"{base}/{file_name}.vcf"
if not os.path.isfile(file_path):
with open(file_path, "w", encoding="utf-8") as f:
f.write(row[2])
data[row[1]]["messages"][row[0]]["data"] = row[3] + \
"{ The vCard file cannot be displayed here, however it " \
"should be located at " + file_path + "}"
"The vCard file cannot be displayed here, " \
f"however it should be located at {file_path}"
data[row[1]]["messages"][row[0]]["mime"] = "text/x-vcard"
data[row[1]]["messages"][row[0]]["meta"] = True
print(f"Gathering vCards...({index + 1}/{total_row_number})", end="\r")
def create_html(data, output_folder):
templateLoader = jinja2.FileSystemLoader(searchpath="./")
def create_html(data, output_folder, template=None):
if template is None:
template_dir = os.path.dirname(__file__)
template_file = "whatsapp.html"
else:
template_dir = os.path.dirname(template)
template_file = os.path.basename(template)
templateLoader = jinja2.FileSystemLoader(searchpath=template_dir)
templateEnv = jinja2.Environment(loader=templateLoader)
templateEnv.globals.update(determine_day=determine_day)
TEMPLATE_FILE = "whatsapp.html"
template = templateEnv.get_template(TEMPLATE_FILE)
templateEnv.filters['sanitize_except'] = sanitize_except
template = templateEnv.get_template(template_file)
total_row_number = len(data)
print(f"\nCreating HTML...(0/{total_row_number})", end="\r")
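
The `decrypt_backup` function added earlier in this file reads three fields from fixed byte offsets of the backup, and the offsets differ between crypt14 and crypt12. A minimal sketch that isolates only the slicing step, using the offsets from the diff; `BackupParts` and `split_backup` are hypothetical names, and no decryption is performed here:

```python
from typing import NamedTuple


class BackupParts(NamedTuple):
    signature: bytes   # compared against key[30:62] in decrypt_backup
    iv: bytes          # 16-byte IV passed to AES-GCM
    ciphertext: bytes  # encrypted, zlib-compressed database


def split_backup(database: bytes, crypt14: bool = True) -> BackupParts:
    """Slice a backup blob at the offsets used by decrypt_backup."""
    if crypt14:
        if len(database) < 191:
            raise ValueError("The crypt14 file must be at least 191 bytes")
        return BackupParts(database[15:47], database[67:83], database[191:])
    if len(database) < 67:
        raise ValueError("The crypt12 file must be at least 67 bytes")
    # For crypt12 the last 20 bytes are excluded from the ciphertext.
    return BackupParts(database[3:35], database[51:67], database[67:-20])
```

Keeping the offsets in one place makes the two formats easy to compare and lets the slicing be tested without pycryptodome installed.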


@@ -6,12 +6,20 @@ import jinja2
import os
import requests
import shutil
import pkgutil
from pathlib import Path
from bleach import clean as sanitize
from markupsafe import Markup
from datetime import datetime
from mimetypes import MimeTypes
APPLE_TIME = datetime.timestamp(datetime(2001, 1, 1))
def sanitize_except(html):
return Markup(sanitize(html, tags=["br"]))
def determine_day(last, current):
last = datetime.fromtimestamp(last).date()
current = datetime.fromtimestamp(current).date()
@@ -61,7 +69,9 @@ def messages(db, data):
"time": datetime.fromtimestamp(ts).strftime("%H:%M"),
"media": False,
"reply": None,
"caption": None
"caption": None,
"meta": False,
"data": None
}
if "-" in content[0] and content[2] == 0:
name = None
@@ -86,8 +96,9 @@ def messages(db, data):
try:
int(content[4])
except ValueError:
msg = "{The group name changed to "f"{content[4]}"" }"
msg = f"The group name changed to {content[4]}"
data[content[0]]["messages"][content[1]]["data"] = msg
data[content[0]]["messages"][content[1]]["meta"] = True
else:
del data[content[0]]["messages"][content[1]]
else:
@@ -98,14 +109,26 @@ def messages(db, data):
# real message
if content[2] == 1:
if content[5] == 14:
msg = "{Message deleted}"
msg = "Message deleted"
data[content[0]]["messages"][content[1]]["meta"] = True
else:
msg = content[4]
if msg is not None:
if "\r\n" in msg:
msg = msg.replace("\r\n", "<br>")
if "\n" in msg:
msg = msg.replace("\n", "<br>")
else:
if content[5] == 14:
msg = "{Message deleted}"
msg = "Message deleted"
data[content[0]]["messages"][content[1]]["meta"] = True
else:
msg = content[4]
if msg is not None:
if "\r\n" in msg:
msg = msg.replace("\r\n", "<br>")
if "\n" in msg:
msg = msg.replace("\n", "<br>")
data[content[0]]["messages"][content[1]]["data"] = msg
i += 1
if i % 1000 == 0:
@@ -137,7 +160,7 @@ def media(db, data, media_folder):
content = c.fetchone()
mime = MimeTypes()
while content is not None:
file_path = f"Message/{content[2]}"
file_path = f"{media_folder}/{content[2]}"
data[content[0]]["messages"][content[1]]["media"] = True
if os.path.isfile(file_path):
@@ -160,8 +183,9 @@ def media(db, data, media_folder):
# data[content[0]]["messages"][content[1]]["data"] = "{The media is missing}"
# data[content[0]]["messages"][content[1]]["mime"] = "media"
# else:
data[content[0]]["messages"][content[1]]["data"] = "{The media is missing}"
data[content[0]]["messages"][content[1]]["data"] = "The media is missing"
data[content[0]]["messages"][content[1]]["mime"] = "media"
data[content[0]]["messages"][content[1]]["meta"] = True
if content[6] is not None:
data[content[0]]["messages"][content[1]]["caption"] = content[6]
i += 1
@@ -189,28 +213,35 @@ def vcard(db, data):
total_row_number = len(rows)
print(f"\nGathering vCards...(0/{total_row_number})", end="\r")
base = "Message/vCards"
if not os.path.isdir(base):
Path(base).mkdir(parents=True, exist_ok=True)
for index, row in enumerate(rows):
if not os.path.isdir(base):
os.mkdir(base)
file_name = "".join(x for x in row[3] if x.isalnum())
file_path = f"{base}/{file_name[:200]}.vcf"
if not os.path.isfile(file_path):
with open(file_path, "w", encoding="utf-8") as f:
f.write(row[4])
data[row[2]]["messages"][row[1]]["data"] = row[3] + \
"{ The vCard file cannot be displayed here, however it " \
"should be located at " + file_path + "}"
"The vCard file cannot be displayed here, " \
f"however it should be located at {file_path}"
data[row[2]]["messages"][row[1]]["mime"] = "text/x-vcard"
data[row[2]]["messages"][row[1]]["media"] = True
data[row[2]]["messages"][row[1]]["meta"] = True
print(f"Gathering vCards...({index + 1}/{total_row_number})", end="\r")
def create_html(data, output_folder):
templateLoader = jinja2.FileSystemLoader(searchpath="./")
def create_html(data, output_folder, template=None):
if template is None:
template_dir = os.path.dirname(__file__)
template_file = "whatsapp.html"
else:
template_dir = os.path.dirname(template)
template_file = os.path.basename(template)
templateLoader = jinja2.FileSystemLoader(searchpath=template_dir)
templateEnv = jinja2.Environment(loader=templateLoader)
templateEnv.globals.update(determine_day=determine_day)
TEMPLATE_FILE = "whatsapp.html"
template = templateEnv.get_template(TEMPLATE_FILE)
templateEnv.filters['sanitize_except'] = sanitize_except
template = templateEnv.get_template(template_file)
total_row_number = len(data)
print(f"\nCreating HTML...(0/{total_row_number})", end="\r")
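
Both extractors now turn line breaks into `<br>` and register a `sanitize_except` filter (bleach's `clean` with `tags=["br"]`) so that only that tag survives in the rendered page. A minimal sketch of the same idea using only the standard library, assuming it is acceptable to escape first and insert `<br>` afterwards; `to_html_text` is a hypothetical helper and is not how the source does it:

```python
import html


def to_html_text(message):
    """Escape a message, then turn line breaks into <br> tags."""
    if message is None:
        return None
    # Escape first, so any markup typed by a user is shown literally.
    escaped = html.escape(message)
    # Replace "\r\n" before "\n" so a Windows line ending gives one <br>.
    return escaped.replace("\r\n", "<br>").replace("\n", "<br>")
```

Because escaping happens before the `<br>` tags are added, the only markup in the result is the markup this function inserts.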


@@ -0,0 +1,133 @@
#!/usr/bin/python3
import shutil
import sqlite3
import os
import getpass
try:
    from iphone_backup_decrypt import EncryptedBackup, RelativePath
except ModuleNotFoundError:
    support_encrypted = False
else:
    support_encrypted = True


def extract_encrypted(base_dir, password):
    backup = EncryptedBackup(backup_directory=base_dir, passphrase=password)
    print("Decrypting WhatsApp database...")
    backup.extract_file(relative_path=RelativePath.WHATSAPP_MESSAGES,
                        output_filename="7c7fba66680ef796b916b067077cc246adacf01d")
    backup.extract_file(relative_path=RelativePath.WHATSAPP_CONTACTS,
                        output_filename="ContactsV2.sqlite")
    data = backup.execute_sql("""SELECT count()
                                 FROM Files
                                 WHERE relativePath
                                     LIKE 'Message/Media/%'"""
                              )
    total_row_number = data[0][0]
    print(f"Gathering media...(0/{total_row_number})", end="\r")
    data = backup.execute_sql("""SELECT fileID,
                                        relativePath,
                                        flags,
                                        file
                                 FROM Files
                                 WHERE relativePath
                                     LIKE 'Message/Media/%'"""
                              )
    if not os.path.isdir("Message"):
        os.mkdir("Message")
    if not os.path.isdir("Message/Media"):
        os.mkdir("Message/Media")
    i = 0
    for row in data:
        destination = row[1]
        hashes = row[0]
        folder = hashes[:2]
        flags = row[2]
        file = row[3]
        if flags == 2:
            try:
                os.mkdir(destination)
            except FileExistsError:
                pass
        elif flags == 1:
            decrypted = backup.decrypt_inner_file(file_id=hashes, file_bplist=file)
            with open(destination, "wb") as f:
                f.write(decrypted)
        i += 1
        if i % 100 == 0:
            print(f"Gathering media...({i}/{total_row_number})", end="\r")
    print(f"Gathering media...({total_row_number}/{total_row_number})", end="\r")


def is_encrypted(base_dir):
    with sqlite3.connect(f"{base_dir}/Manifest.db") as f:
        c = f.cursor()
        try:
            c.execute("""SELECT count()
                         FROM Files
                      """)
        except sqlite3.DatabaseError:
            return True
        else:
            return False


def extract_media(base_dir):
    if is_encrypted(base_dir):
        if not support_encrypted:
            print("You don't have the dependencies to handle encrypted backup.")
            print("Read more on how to deal with encrypted backup:")
            print("https://github.com/KnugiHK/Whatsapp-Chat-Exporter/blob/main/README.md#usage")
            return False
        password = getpass.getpass("Enter the password:")
        extract_encrypted(base_dir, password)
    else:
        wts_db = os.path.join(base_dir, "7c/7c7fba66680ef796b916b067077cc246adacf01d")
        if not os.path.isfile(wts_db):
            print("WhatsApp database not found.")
            exit()
        else:
            shutil.copyfile(wts_db, "7c7fba66680ef796b916b067077cc246adacf01d")
        with sqlite3.connect(f"{base_dir}/Manifest.db") as manifest:
            c = manifest.cursor()
            c.execute("""SELECT count()
                         FROM Files
                         WHERE relativePath
                             LIKE 'Message/Media/%'""")
            total_row_number = c.fetchone()[0]
            print(f"Gathering media...(0/{total_row_number})", end="\r")
            c.execute("""SELECT fileID,
                                relativePath,
                                flags
                         FROM Files
                         WHERE relativePath
                             LIKE 'Message/Media/%'""")
            row = c.fetchone()
            if not os.path.isdir("Message"):
                os.mkdir("Message")
            if not os.path.isdir("Message/Media"):
                os.mkdir("Message/Media")
            i = 0
            while row is not None:
                destination = row[1]
                hashes = row[0]
                folder = hashes[:2]
                flags = row[2]
                if flags == 2:
                    os.mkdir(destination)
                elif flags == 1:
                    shutil.copyfile(f"{base_dir}/{folder}/{hashes}", destination)
                i += 1
                if i % 100 == 0:
                    print(f"Gathering media...({i}/{total_row_number})", end="\r")
                row = c.fetchone()
            print(f"Gathering media...({total_row_number}/{total_row_number})", end="\r")


if __name__ == "__main__":
    from optparse import OptionParser
    parser = OptionParser()
    (_, args) = parser.parse_args()
    base_dir = args[0]
    extract_media(base_dir)
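
`is_encrypted` infers that a backup is encrypted when a query against `Manifest.db` raises `sqlite3.DatabaseError`. A minimal sketch of that probe in isolation; `manifest_is_readable` is a hypothetical name, and unlike the source it queries `sqlite_master` rather than the `Files` table and closes the connection explicitly:

```python
import sqlite3
from contextlib import closing


def manifest_is_readable(manifest_path):
    """Return True if the file can be queried as a SQLite database."""
    # sqlite3.connect is lazy: an unreadable file only fails on a query.
    # closing() is used because the connection's own context manager
    # commits or rolls back but does not close the connection.
    with closing(sqlite3.connect(manifest_path)) as conn:
        try:
            conn.execute("SELECT count(*) FROM sqlite_master")
        except sqlite3.DatabaseError:
            return False
    return True
```

Under this sketch, `not manifest_is_readable(path)` plays the role of `is_encrypted`. Note that the heuristic cannot tell an encrypted manifest from a corrupted one.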


@@ -1,158 +1,174 @@
<!DOCTYPE html>
<html>
<head>
<title>Whatsapp - {{ name }}</title>
<link rel="stylesheet" href="https://www.w3schools.com/w3css/4/w3.css">
<style>
@import url('https://fonts.googleapis.com/css2?family=Noto+Sans+HK:wght@300;400&display=swap');
html {
font-family: 'Noto Sans HK', sans-serif;
font-size: 12px;
scroll-behavior: smooth;
}
header {
position: fixed;
z-index: 20;
border-bottom: 2px solid #e3e6e7;
font-size: 2em;
font-weight: bolder;
background-color: white;
padding: 20px 0 20px 0;
}
footer {
border-top: 2px solid #e3e6e7;
font-size: 2em;
padding: 20px 0 20px 0;
}
article {
width:500px;
margin:100px auto;
z-index:10;
font-size: 15px;
word-wrap: break-word;
}
img, video {
max-width:100%;
}
a.anchor {
display: block;
position: relative;
top: -100px;
visibility: hidden;
}
div.reply{
font-size: 13px;
text-decoration: none;
}
</style>
</head>
<body>
<header class="w3-center w3-top">Chat history with {{ name }}</header>
<article class="w3-container">
<div class="table" style="width:100%">
{% set last = {'last': 946688461.001} %}
{% for msg in msgs -%}
<div class="w3-row" style="padding-bottom: 10px">
<a class="anchor" id="{{ msg.key_id }}"></a>
{% if determine_day(last.last, msg.timestamp) is not none %}
<div class="w3-center" style="color:#70777c;padding: 10px 0 10px 0;">{{ determine_day(last.last, msg.timestamp) }}</div>
{% if last.update({'last': msg.timestamp}) %}{% endif %}
{% endif %}
{% if msg.from_me == true %}
<div class="w3-row">
<div style="float: left; color:#70777c;">{{ msg.time }}</div>
<div style="padding-left: 10px; text-align: right; color: #3892da;">You</div>
</div>
<div class="w3-row">
<div class="w3-col m10 l10">
<div style="text-align: right;">
{% if msg.reply is not none %}
<div class="reply">
<span style="color: #70777a;">Replying to </span>
<a href="#{{msg.reply}}" style="color: #168acc;">"{{ msg.quoted_data or 'media' }}"</a>
</div>
{% endif %}
{% if msg.media == false %}
{% filter escape %}{{ msg.data or "{This message is not supported yet}" | replace('\n', '<br>') }}{% endfilter %}
{% else %}
{% if "image/" in msg.mime %}
<a href="{{ msg.data }}"><img src="{{ msg.data }}" /></a>
{% elif "audio/" in msg.mime %}
<audio controls="controls" autobuffer="autobuffer">
<source src="{{ msg.data }}" />
</audio>
{% elif "video/" in msg.mime %}
<video controls="controls" autobuffer="autobuffer">
<source src="{{ msg.data }}" />
</video>
{% elif "/" in msg.mime %}
{The file cannot be displayed here, however it should be located at {{ msg.data }}}
{% else %}
{% filter escape %}{{ msg.data }}{% endfilter %}
{% endif %}
{% if msg.caption is not none %}
<br>
{{ msg.caption }}
{% endif %}
{% endif %}
</div>
</div>
<div class="w3-col m2 l2" style="padding-left: 10px"><img src="{{ my_avatar }}" onerror="this.style.display='none'"></div>
</div>
{% else %}
<div class="w3-row">
<div style="padding-right: 10px; float: left; color: #3892da;">
{% if msg.sender is not none %}
{{ msg.sender }}
{% else %}
{{ name }}
{% endif %}
</div>
<div style="text-align: right; color:#70777c;">{{ msg.time }}</div>
</div>
<div class="w3-row">
<div class="w3-col m2 l2"><img src="{{ their_avatar }}" onerror="this.style.display='none'"></div>
<div class="w3-col m10 l10">
<div style="text-align: left;">
{% if msg.reply is not none %}
<div class="reply">
<span style="color: #70777a;">Replying to </span>
<a href="#{{msg.reply}}" style="color: #168acc;">"{{ msg.quoted_data or 'media' }}"</a>
</div>
{% endif %}
{% if msg.media == false %}
{% filter escape %}{{ msg.data or "{This message is not supported yet}" }}{% endfilter %}
{% else %}
{% if "image/" in msg.mime %}
<a href="{{ msg.data }}"><img src="{{ msg.data }}" /></a>
{% elif "audio/" in msg.mime %}
<audio controls="controls" autobuffer="autobuffer">
<source src="{{ msg.data }}" />
</audio>
{% elif "video/" in msg.mime %}
<video controls="controls" autobuffer="autobuffer">
<source src="{{ msg.data }}" />
</video>
{% elif "/" in msg.mime %}
{The file cannot be displayed here, however it should be located at {{ msg.data }}}
{% else %}
{% filter escape %}{{ msg.data }}{% endfilter %}
{% endif %}
{% if msg.caption is not none %}
<br>
{{ msg.caption }}
{% endif %}
{% endif %}
</div>
</div>
</div>
{% endif %}
</div>
{% endfor %}
</div>
</article>
<footer class="w3-center">
End of history
</footer>
</body>
<!DOCTYPE html>
<html>
<head>
<title>Whatsapp - {{ name }}</title>
<link rel="stylesheet" href="https://www.w3schools.com/w3css/4/w3.css">
<style>
@import url('https://fonts.googleapis.com/css2?family=Noto+Sans+HK:wght@300;400&display=swap');
html {
font-family: 'Noto Sans HK', sans-serif;
font-size: 12px;
scroll-behavior: smooth;
}
header {
position: fixed;
z-index: 20;
border-bottom: 2px solid #e3e6e7;
font-size: 2em;
font-weight: bolder;
background-color: white;
padding: 20px 0 20px 0;
}
footer {
border-top: 2px solid #e3e6e7;
font-size: 2em;
padding: 20px 0 20px 0;
}
article {
width:500px;
margin:100px auto;
z-index:10;
font-size: 15px;
word-wrap: break-word;
}
img, video {
max-width:100%;
}
a.anchor {
display: block;
position: relative;
top: -100px;
visibility: hidden;
}
div.reply{
font-size: 13px;
text-decoration: none;
}
</style>
</head>
<body>
<header class="w3-center w3-top">Chat history with {{ name }}</header>
<article class="w3-container">
<div class="table" style="width:100%">
{% set last = {'last': 946688461.001} %}
{% for msg in msgs -%}
<div class="w3-row" style="padding-bottom: 10px">
<a class="anchor" id="{{ msg.key_id }}"></a>
{% if determine_day(last.last, msg.timestamp) is not none %}
<div class="w3-center" style="color:#70777c;padding: 10px 0 10px 0;">{{ determine_day(last.last, msg.timestamp) }}</div>
{% if last.update({'last': msg.timestamp}) %}{% endif %}
{% endif %}
{% if msg.from_me == true %}
<div class="w3-row">
<div style="float: left; color:#70777c;">{{ msg.time }}</div>
<div style="padding-left: 10px; text-align: right; color: #3892da;">You</div>
</div>
<div class="w3-row">
<div class="w3-col m10 l10">
<div style="text-align: right;">
{% if msg.reply is not none %}
<div class="reply">
<span style="color: #70777a;">Replying to </span>
<a href="#{{msg.reply}}" style="color: #168acc;">"{{ msg.quoted_data or 'media' }}"</a>
</div>
{% endif %}
{% if msg.meta == true or msg.media == false and msg.data is none %}
<div style="text-align: center;" class="w3-panel w3-border-blue w3-pale-blue w3-rightbar w3-leftbar">
<p>{{ msg.data or 'This message is not supported' }}</p>
</div>
{% else %}
{% if msg.media == false %}
{{ msg.data | sanitize_except() }}
{% else %}
{% if "image/" in msg.mime %}
<a href="{{ msg.data }}"><img src="{{ msg.data }}" /></a>
{% elif "audio/" in msg.mime %}
<audio controls="controls" autobuffer="autobuffer">
<source src="{{ msg.data }}" />
</audio>
{% elif "video/" in msg.mime %}
<video controls="controls" autobuffer="autobuffer">
<source src="{{ msg.data }}" />
</video>
{% elif "/" in msg.mime %}
<div style="text-align: center;" class="w3-panel w3-border-blue w3-pale-blue w3-rightbar w3-leftbar">
<p>The file cannot be displayed here, however it should be located at {{ msg.data }}</p>
</div>
{% else %}
{% filter escape %}{{ msg.data }}{% endfilter %}
{% endif %}
{% if msg.caption is not none %}
<br>
{{ msg.caption }}
{% endif %}
{% endif %}
{% endif %}
</div>
</div>
<div class="w3-col m2 l2" style="padding-left: 10px"><img src="{{ my_avatar }}" onerror="this.style.display='none'"></div>
</div>
{% else %}
<div class="w3-row">
<div style="padding-right: 10px; float: left; color: #3892da;">
{% if msg.sender is not none %}
{{ msg.sender }}
{% else %}
{{ name }}
{% endif %}
</div>
<div style="text-align: right; color:#70777c;">{{ msg.time }}</div>
</div>
<div class="w3-row">
<div class="w3-col m2 l2"><img src="{{ their_avatar }}" onerror="this.style.display='none'"></div>
<div class="w3-col m10 l10">
<div style="text-align: left;">
{% if msg.reply is not none %}
<div class="reply">
<span style="color: #70777a;">Replying to </span>
<a href="#{{msg.reply}}" style="color: #168acc;">"{{ msg.quoted_data or 'media' }}"</a>
</div>
{% endif %}
{% if msg.meta == true or msg.media == false and msg.data is none %}
<div style="text-align: center;" class="w3-panel w3-border-blue w3-pale-blue w3-rightbar w3-leftbar">
<p>{{ msg.data or 'This message is not supported' }}</p>
</div>
{% else %}
{% if msg.media == false %}
{{ msg.data | sanitize_except() }}
{% else %}
{% if "image/" in msg.mime %}
<a href="{{ msg.data }}"><img src="{{ msg.data }}" /></a>
{% elif "audio/" in msg.mime %}
<audio controls="controls" autobuffer="autobuffer">
<source src="{{ msg.data }}" />
</audio>
{% elif "video/" in msg.mime %}
<video controls="controls" autobuffer="autobuffer">
<source src="{{ msg.data }}" />
</video>
{% elif "/" in msg.mime %}
<div style="text-align: center;" class="w3-panel w3-border-blue w3-pale-blue w3-rightbar w3-leftbar">
<p>The file cannot be displayed here, however it should be located at {{ msg.data }}</p>
</div>
{% else %}
{% filter escape %}{{ msg.data }}{% endfilter %}
{% endif %}
{% if msg.caption is not none %}
<br>
{{ msg.caption }}
{% endif %}
{% endif %}
{% endif %}
</div>
</div>
</div>
{% endif %}
</div>
{% endfor %}
</div>
</article>
<footer class="w3-center">
End of history
</footer>
</body>
</html>
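
The new template calls `determine_day(last.last, msg.timestamp)` for every message and prints a centered date line whenever the result is not none. The helper itself is not part of this diff, so the following is only a minimal sketch of how such a function might behave. It assumes both arguments are Unix timestamps in seconds (consistent with the initial value `946688461.001`) and uses UTC for reproducibility; the project's real implementation may use local time and a different return type.

```python
from datetime import datetime, timezone


def determine_day(last, current):
    """Hypothetical helper: return the date of `current` when it falls on a
    different calendar day than `last`, otherwise None."""
    # Assumption: both values are Unix timestamps in seconds, compared in UTC.
    last_day = datetime.fromtimestamp(last, tz=timezone.utc).date()
    current_day = datetime.fromtimestamp(current, tz=timezone.utc).date()
    if last_day == current_day:
        return None
    return current_day
```

With a helper of this shape, the template only emits a date separator on the first message of each day, because `last.last` is updated right after a separator is printed.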

1
_config.yml Normal file

@@ -0,0 +1 @@
theme: jekyll-theme-cayman


@@ -1,50 +0,0 @@
#!/usr/bin/python3

import shutil
import sqlite3
import os


def extract_media(base_dir):
    with sqlite3.connect(f"{base_dir}/Manifest.db") as manifest:
        c = manifest.cursor()
        c.execute("""SELECT count()
                     FROM Files
                     WHERE relativePath
                         LIKE 'Message/Media/%'""")
        total_row_number = c.fetchone()[0]
        print(f"Gathering media...(0/{total_row_number})", end="\r")
        c.execute("""SELECT fileID,
                            relativePath,
                            flags
                     FROM Files
                     WHERE relativePath
                         LIKE 'Message/Media/%'""")
        row = c.fetchone()
        if not os.path.isdir("Message"):
            os.mkdir("Message")
        if not os.path.isdir("Message/Media"):
            os.mkdir("Message/Media")
        i = 0
        while row is not None:
            destination = row[1]
            hashes = row[0]
            folder = hashes[:2]
            flags = row[2]
            if flags == 2:
                os.mkdir(destination)
            elif flags == 1:
                shutil.copyfile(f"{base_dir}/{folder}/{hashes}", destination)
            i += 1
            if i % 100 == 0:
                print(f"Gathering media...({i}/{total_row_number})", end="\r")
            row = c.fetchone()
        print(f"Gathering media...({total_row_number}/{total_row_number})", end="\r")


if __name__ == "__main__":
    from optparse import OptionParser
    parser = OptionParser()
    (_, args) = parser.parse_args()
    base_dir = args[0]
    extract_media(base_dir)
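
The removed script above finds each media file in a subfolder named after the first two characters of its `fileID` and copies it to its `relativePath`. A minimal sketch of that lookup follows. The helper names `backup_file_path` and `copy_media` are hypothetical and not part of the project, and `os.makedirs` is swapped in for the script's `os.mkdir` so that missing parent directories are created first.

```python
import os
import shutil


def backup_file_path(base_dir, file_id):
    """Return the on-disk location of a backup file given its fileID."""
    # Files are stored in a subfolder named after the first two characters.
    return os.path.join(base_dir, file_id[:2], file_id)


def copy_media(base_dir, file_id, destination):
    """Copy one backup file to `destination`, creating parent folders."""
    # os.mkdir fails when a parent directory is absent; makedirs does not.
    os.makedirs(os.path.dirname(destination) or ".", exist_ok=True)
    shutil.copyfile(backup_file_path(base_dir, file_id), destination)
```

For example, `copy_media(base_dir, "abcdef", "Message/Media/chat/photo.jpg")` would read `base_dir/ab/abcdef` and write the copy under `Message/Media/chat/`; the `fileID` and destination here are made-up values.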

BIN
group.png
Binary file not shown. Before: 29 KiB

BIN
imgs/android_structure.png Normal file
Binary file not shown. After: 12 KiB

Binary file not shown. After: 12 KiB

BIN
imgs/group.png Normal file
Binary file not shown. After: 36 KiB

BIN
imgs/pm.png Normal file
Binary file not shown. After: 38 KiB

Before: 7.8 KiB. After: 7.8 KiB

34
old_README.md Normal file

@@ -0,0 +1,34 @@
# Whatsapp-Chat-Exporter
A Whatsapp database parser that will give you the history of your Whatsapp conversations in HTML and JSON
**If you plan to uninstall WhatsApp or delete your WhatsApp account, please back up your WhatsApp database first. You may want to run this exporter on the same database again as the exporter develops.**
# Usage
First, clone this repo. Optionally, copy all py and html files to a working directory.
```shell
git clone https://github.com/KnugiHK/Whatsapp-Chat-Exporter.git
```
Then, get your WhatsApp database ready and place it in the root of the working directory.
* For Android, it is called msgstore.db. If you want the names of your contacts, also get the contact database, which is called wa.db.
* For iPhone, it is called 7c7fba66680ef796b916b067077cc246adacf01d (yes, a hash).
Next, get your media folder ready and place it in the root of the working directory.
* For Android, copy the WhatsApp directory from your phone directly.
* For iPhone, run extract_iphone_media.py, and you will get a folder called Message.
```
python extract_iphone_media.py "C:\Users\[Username]\AppData\Roaming\Apple Computer\MobileSync\Backup\[device id]"
```
And now, you should have something like this:
![Folder structure](imgs/structure.png)
Finally, run the script that matches your type of phone.
```
python extract.py & :: Android
python extract_iphone.py & :: iPhone
```
And you will get these:
#### Private Message
![Private Message](imgs/pm.png)
#### Group Message
![Group Message](imgs/group.png)

BIN
pm.png
Binary file not shown. Before: 32 KiB

51
setup.py Normal file

@@ -0,0 +1,51 @@
import setuptools
from re import search

with open("README.md", "r") as fh:
    long_description = fh.read()

with open("Whatsapp_Chat_Exporter/__init__.py", encoding="utf8") as f:
    version = search(r'__version__ = "(.*?)"', f.read()).group(1)

setuptools.setup(
    name="whatsapp-chat-exporter",
    version=version,
    author="KnugiHK",
    author_email="info@knugi.com",
    description="A Whatsapp database parser that will give you the "
                "history of your Whatsapp conversations in HTML and JSON.",
    long_description=long_description,
    long_description_content_type="text/markdown",
    url="https://github.com/KnugiHK/Whatsapp-Chat-Exporter",
    packages=setuptools.find_packages(),
    package_data={
        '': ['whatsapp.html']
    },
    classifiers=[
        "Programming Language :: Python :: 3 :: Only",
        "Programming Language :: Python :: 3.7",
        "Programming Language :: Python :: 3.8",
        "Programming Language :: Python :: 3.9",
        "License :: OSI Approved :: MIT License",
        "Operating System :: OS Independent",
        "Development Status :: 4 - Beta",
        "Environment :: Console",
        "Intended Audience :: End Users/Desktop",
        "Topic :: Communications :: Chat",
        "Topic :: Utilities",
        "Topic :: Database"
    ],
    python_requires='>=3.7',
    install_requires=[
        'jinja2',
        'bleach'
    ],
    extras_require={
        'android_backup': ["pycryptodome"]
    },
    entry_points={
        "console_scripts": [
            "wtsexporter = Whatsapp_Chat_Exporter.__main__:main"
        ]
    }
)
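
The setup.py above reads the package version by searching `Whatsapp_Chat_Exporter/__init__.py` for a `__version__ = "..."` assignment instead of importing the package. A minimal sketch of that lookup in isolation follows. The function name `read_version` and the explicit error for a missing match are assumptions added for illustration; setup.py itself calls `.group(1)` directly on the search result.

```python
from re import search


def read_version(source):
    """Extract the version string from the text of an __init__.py file."""
    # Same pattern as setup.py: a non-greedy match between double quotes.
    match = search(r'__version__ = "(.*?)"', source)
    if match is None:
        # Assumption: fail with a clear message rather than an AttributeError.
        raise ValueError("no __version__ assignment found")
    return match.group(1)
```

Reading the version from the file text means setup.py does not need the package's dependencies installed at build time. Note that the pattern only matches a double-quoted assignment with single spaces around `=`.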