159 Commits
0.3 ... 0.9.1

Author SHA1 Message Date
KnugiHK
20d8e1384a Bump version 2023-05-17 00:08:52 +08:00
KnugiHK
6fd0e61b64 Update help test 2023-05-16 23:29:10 +08:00
KnugiHK
bbb47cd839 Bug fix 2023-05-16 23:25:12 +08:00
Knugi
c155064ae1 Update README.md 2023-05-16 13:38:16 +00:00
Knugi
d4efd919f9 Update python-publish.yml 2023-05-16 11:49:57 +00:00
Knugi
13d761286e Update compile-binary.yml 2023-05-16 11:48:25 +00:00
Knugi
a943808734 Update compile-binary.yml 2023-05-16 11:48:00 +00:00
KnugiHK
0495970c38 Bump version 2023-05-16 19:33:29 +08:00
KnugiHK
32b14dc392 Update README.md 2023-05-16 19:31:45 +08:00
KnugiHK
6bea8d07f4 Update setup.py 2023-05-05 14:03:40 +08:00
KnugiHK
69fdb61bae Merge branch 'dev' 2023-05-05 13:55:41 +08:00
KnugiHK
e1f160fc7c Update setup.py 2023-05-05 13:53:11 +08:00
KnugiHK
3b7e02ba31 Add update checking 2023-05-05 13:53:00 +08:00
KnugiHK
bdb7d80831 Merge branch 'dev' 2023-05-05 10:42:48 +08:00
Knugi
ea7e019adc Update README.md
Make the command compatible with both Linux and Windows
2023-04-25 05:41:27 +00:00
KnugiHK
c7a01bb9c0 Handle deleted message in new schema
Related to #39 and #9
2023-04-25 13:02:20 +08:00
KnugiHK
7c0b90d458 Another attempt to fix the previous bug 2023-04-09 02:38:26 +08:00
KnugiHK
84383e1d9d Revert "Bug fix on empty vcf contact name"
This reverts commit 06a1d34567.
2023-04-09 02:37:04 +08:00
KnugiHK
06a1d34567 Bug fix on empty vcf contact name 2023-04-09 02:23:50 +08:00
KnugiHK
b371587d65 Bug fix on ChatStore initialization 2023-04-09 02:05:17 +08:00
KnugiHK
3e7d7916a7 PEP8 2023-03-28 12:23:46 +08:00
Knugi
17997e840f Update README.md 2023-03-28 04:21:09 +00:00
KnugiHK
640acb3f86 Move some utility functions to a separated python file 2023-03-25 18:33:22 +08:00
KnugiHK
cdfaf69f7a Add an option to disable html output 2023-03-25 18:26:03 +08:00
KnugiHK
ee5f8b82be Add suggestion to CryptX 2023-03-25 12:11:00 +08:00
KnugiHK
b9fa36acb4 Raise error if the error is not expected 2023-03-25 11:55:18 +08:00
KnugiHK
fb88124d21 Merge old and new schema processing logic 2023-03-25 11:51:27 +08:00
KnugiHK
b5effbd512 Change autobuffer to preload for video and audio tag 2023-03-25 11:46:52 +08:00
KnugiHK
430a5eccb8 Merge branch 'pr/26' into dev 2023-03-25 10:37:47 +08:00
GoComputing
8f0511a6e2 Re-enabled the HTML generation 2023-03-24 19:43:27 +01:00
Knugi
45666d8878 Add Python 3.10 to classifiers 2023-02-26 07:43:36 +00:00
KnugiHK
2ba55719f1 Add #32 to common offset 2023-02-26 15:31:03 +08:00
KnugiHK
2d23052758 Improve argument parser 2023-02-13 16:31:35 +08:00
KnugiHK
10875060c9 Deprecate --iphone and replace with --ios 2023-02-13 16:15:54 +08:00
KnugiHK
0bb99d59e0 Prepare for size control of output file 2023-02-13 16:08:21 +08:00
KnugiHK
9178e5326b Transit from optparse to argparse 2023-02-13 12:53:20 +08:00
KnugiHK
26320413e8 Add offline availability of w3css 2023-02-13 12:23:43 +08:00
KnugiHK
a275a0f40c Why is this line not in last commit... 2023-02-13 00:27:55 +08:00
KnugiHK
4cb4ac3e7b Bug fix for sender name in group chat
#9
2023-02-13 00:25:31 +08:00
Knugi
4139cab00f Fix binary 2023-02-12 11:51:43 +00:00
Knugi
c1964bc2cd Prepare for standalone binary
https://github.com/KnugiHK/Whatsapp-Chat-Exporter/issues/29
2023-02-12 18:11:38 +08:00
KnugiHK
dab0493354 Revert "Prepare for standalone binary"
This reverts commit 4d6c80b561.
2023-02-12 18:11:00 +08:00
Knugi
e0c464c8d8 Update compile-binary.yml 2023-02-12 18:01:29 +08:00
KnugiHK
92d339d1c0 Implement standalone binary compilation 2023-02-12 18:01:29 +08:00
Knugi
4d6c80b561 Prepare for standalone binary
https://github.com/KnugiHK/Whatsapp-Chat-Exporter/issues/29
2023-02-12 18:01:29 +08:00
Knugi
d46a42a097 Update compile-binary.yml 2023-02-12 09:56:55 +00:00
KnugiHK
7cd259143a Implement standalone binary compilation 2023-02-12 17:54:32 +08:00
Knugi
726812a5f7 Prepare for standalone binary
https://github.com/KnugiHK/Whatsapp-Chat-Exporter/issues/29
2023-02-12 08:15:40 +00:00
KnugiHK
6fddc1c23a Merge branch 'main' into dev 2023-01-31 17:56:53 +08:00
KnugiHK
77ceaa25dd Bump version 2023-01-31 17:52:34 +08:00
KnugiHK
e09f18e2f2 Minor fix 2023-01-31 17:52:12 +08:00
KnugiHK
23114572bd Forgot to change the variable lol 2023-01-31 17:52:12 +08:00
KnugiHK
2f04b69f38 A more concrete way to determine database offset 2023-01-31 17:52:12 +08:00
KnugiHK
e7c246822b Link to the file intead of showing the path directly
Not tested
Ref: https://github.com/KnugiHK/Whatsapp-Chat-Exporter/issues/15
2023-01-31 17:52:12 +08:00
KnugiHK
2a215d024f Bug fix
Duplicated folder creation
https://github.com/KnugiHK/Whatsapp-Chat-Exporter/issues/14
2023-01-31 17:52:12 +08:00
KnugiHK
f267f53007 Remove unused dependencies 2023-01-31 17:52:12 +08:00
KnugiHK
3a30dfc800 Bump version 2023-01-31 17:52:12 +08:00
KnugiHK
9600da59ae Correct the default in #25 2023-01-31 17:25:59 +08:00
KnugiHK
26b58843fb Add message 2023-01-31 16:46:22 +08:00
KnugiHK
60575c7989 Implement #25
Copying media folder to the output directory will be the default starting from this commit.
2023-01-31 16:34:34 +08:00
KnugiHK
14b1cb7fde Minor fix 2023-01-30 18:53:01 +08:00
GoComputing
92b8903521 Fixed JSON export
Added serialization to the classes 'ChatStore' and 'Messages' so that they can be
JSON serialized.
2023-01-28 20:51:40 +01:00
KnugiHK
d3892a4e4f Fix caption part 2022-12-23 17:28:23 +08:00
KnugiHK
b37c13434e Merge branch 'dev' into message_table 2022-12-23 16:49:37 +08:00
KnugiHK
4b357d5ea9 Update the import of Crypt to latest one for message 2022-12-23 16:49:28 +08:00
KnugiHK
6407ba2136 Adopt the latest version 2022-12-21 21:45:20 +08:00
KnugiHK
f87108dadc Some left-over 2022-12-21 21:42:54 +08:00
KnugiHK
6ca7e81484 Support new WhatsApp database schema
https://github.com/KnugiHK/Whatsapp-Chat-Exporter/issues/9
2022-12-21 21:28:54 +08:00
KnugiHK
41d3659269 Prepare for porting 2022-12-21 20:16:37 +08:00
Knugi
580eaddb24 Delete old_README.md 2022-10-24 03:59:42 +00:00
Knugi
77b4b784d3 Update README.md 2022-10-24 03:59:24 +00:00
KnugiHK
d9a77e0eec Forgot to change the variable lol 2022-09-05 12:49:12 +08:00
KnugiHK
876729eb81 A more concrete way to determine database offset 2022-09-05 12:48:36 +08:00
KnugiHK
48f667d02b Implement exporting 64-digit crypt15 encryption key
https://github.com/KnugiHK/Whatsapp-Chat-Exporter/issues/20
2022-09-05 12:16:07 +08:00
KnugiHK
422ab2f784 Link to the file intead of showing the path directly
Not tested
Ref: https://github.com/KnugiHK/Whatsapp-Chat-Exporter/issues/15
2022-07-03 12:30:01 +08:00
KnugiHK
996ee65525 Bug fix
Duplicated folder creation
https://github.com/KnugiHK/Whatsapp-Chat-Exporter/issues/14
2022-05-25 18:28:07 +08:00
KnugiHK
042f6f9024 Remove unused dependencies 2022-05-23 20:12:50 +08:00
KnugiHK
507e88d9c3 Merge branch 'dev' 2022-05-09 18:31:03 +08:00
KnugiHK
60e1e7d3eb Bump version 2022-05-09 18:28:13 +08:00
Knugi
774fb6d781 Merge pull request #11 from asla9709/dev
Fixed bug where blank VCard media_name would crash the program.
2022-05-09 10:03:04 +00:00
Aakif Aslam
3ef3b02230 Fixed bug where blank VCard media_name would crash the program. 2022-04-24 18:01:32 -04:00
Knugi
07cc0f3571 Update README.md 2022-04-04 07:57:12 +00:00
KnugiHK
a1319eb835 Exclude default conversation from results 2022-04-01 17:17:43 +08:00
KnugiHK
8cbb0af43a Oh. I missed this change 2022-03-04 14:06:19 +08:00
KnugiHK
28c4a7b99f Prepare for new function 2022-03-04 14:03:15 +08:00
KnugiHK
e4c9d42927 Update __init__.py 2022-03-04 13:48:12 +08:00
KnugiHK
c274b6b1c0 Merge branch 'dev' 2022-03-04 13:47:54 +08:00
KnugiHK
eec739d7cf Add crypt15 dependency to android_backup 2022-03-04 13:47:17 +08:00
Knugi
3d7dca0682 Delete _config.yml 2022-02-25 08:57:31 +00:00
Knugi
24f7837171 Create python-publish.yml 2022-02-22 14:01:36 +00:00
KnugiHK
15201acbe6 Bump version 2022-02-22 21:53:13 +08:00
Knugi
6fd290efd8 typo 2022-02-22 13:52:06 +00:00
Knugi
691bfe31c8 Update README.md 2022-02-22 13:51:00 +00:00
KnugiHK
64eb2bcb9d Add some aliases 2022-02-22 21:42:58 +08:00
KnugiHK
1bc4a8c5b9 Create android_structure_backup_crypt15.png 2022-02-22 21:42:50 +08:00
Knugi
8a621827ff Update README.md 2022-02-22 13:42:16 +00:00
KnugiHK
227f438404 Add one more offset 2022-02-22 21:22:07 +08:00
KnugiHK
3e71817778 Make the brute-force more sensitive and bug fix 2022-02-22 21:19:19 +08:00
KnugiHK
08c5979eed Support Hex key for Crypt15 2022-02-22 18:59:04 +08:00
KnugiHK
0e6319eb4e Support crypt15 2022-02-22 18:33:54 +08:00
KnugiHK
734bb78cd8 Implement offset brute forcing
Not tested yet
2022-02-22 02:08:44 +08:00
KnugiHK
a522eb2034 PEP8 2022-01-17 13:01:10 +08:00
KnugiHK
9fe6a0d2a8 Refactoring 2022-01-17 12:06:15 +08:00
KnugiHK
c73eabe2a4 Clearer error message in decompressing decrypted backup 2022-01-17 10:46:02 +08:00
KnugiHK
1faf111e64 Handle case that the database file does not exist and clearer exit code 2021-12-30 11:59:30 +08:00
KnugiHK
9140c07feb Prevent PermissionError from raising 2021-12-28 20:04:40 +08:00
KnugiHK
abf4b20bc6 Merge branch 'main' of https://github.com/KnugiHK/Whatsapp-Chat-Exporter 2021-12-28 19:44:00 +08:00
KnugiHK
f2f6258960 Bump version number 2021-12-28 19:38:01 +08:00
Knugi
62af48c78e Update README.md 2021-12-28 11:34:23 +00:00
Knugi
c9158d202d Update To do 2021-12-28 11:27:29 +00:00
KnugiHK
fb5e4d5421 Merge branch 'dev' 2021-12-28 19:26:02 +08:00
Knugi
d85c91fbdc Update README.md 2021-11-18 03:09:03 +00:00
KnugiHK
0dddb63c5e Merge branch 'main' into dev 2021-08-13 20:28:49 +08:00
KnugiHK
dd36960ecb Implement CSS for metadata 2021-08-13 20:25:59 +08:00
Knugi
620a1bcdb7 Update README.md 2021-07-13 10:54:07 +00:00
Knugi
896a6d2ddd Update README.md 2021-07-13 10:53:41 +00:00
KnugiHK
4ee92e7efc Bug fix
The directory cannot be created if the parent directory is not present
2021-07-11 18:24:59 +08:00
KnugiHK
f91aac1e11 Implementing newline to <br> 2021-07-11 18:17:06 +08:00
KnugiHK
27a6ff98b3 Merge branch 'main' into dev 2021-07-11 11:05:41 +08:00
Knugi
1952c0835c Update README.md 2021-07-11 03:05:15 +00:00
KnugiHK
ab42cad166 Some PEP8 2021-07-10 22:01:04 +08:00
KnugiHK
3ed59ee051 Bug fix 2021-07-10 21:53:28 +08:00
KnugiHK
f9358ded14 Update README.md 2021-07-10 21:50:30 +08:00
KnugiHK
790f4ec5e0 Support custom template 2021-07-10 21:46:45 +08:00
KnugiHK
35ef4031fc Support crypt12 2021-07-10 21:28:49 +08:00
KnugiHK
b9f343cf2f Update README and setup.py 2021-07-10 21:11:49 +08:00
KnugiHK
18ee152688 Support crypt14 WhatsApp Backup 2021-07-10 21:08:52 +08:00
Knugi
620e89a185 Update README.md 2021-07-06 03:17:57 +00:00
KnugiHK
3ada8916f9 Merge branch 'main' of https://github.com/KnugiHK/Whatsapp-Chat-Exporter into main 2021-05-31 16:13:07 +08:00
Knugi
07ebb692e5 Create CNAME 2021-05-31 08:05:44 +00:00
Knugi
7255f0fe2b Set theme jekyll-theme-cayman 2021-05-31 08:04:21 +00:00
KnugiHK
684badb9a6 Update the string appear in wtsexporter --version 2021-05-31 11:09:47 +08:00
Knugi
e1221d9f59 Update README.md 2021-05-31 03:04:07 +00:00
KnugiHK
fc84e430ed Update setup.py 2021-05-31 10:59:29 +08:00
KnugiHK
0b0af518c3 Update setup.py 2021-05-31 10:54:35 +08:00
KnugiHK
7e84595074 Update setup.py 2021-05-31 10:49:48 +08:00
KnugiHK
1bdd5fe6df Update extract_iphone_media.py 2021-05-31 10:47:40 +08:00
KnugiHK
0ac7eecb47 Update version number for Pypi 2021-05-31 10:44:04 +08:00
KnugiHK
b0a469509c Update README.md 2021-05-31 10:43:49 +08:00
Knugi
cec68dd3a0 Update README.md 2021-05-12 06:11:24 +00:00
Knugi
46e12daa6a Update README.md 2021-05-12 05:51:27 +00:00
Knugi
4dd7f4e753 Remove a to-do task 2021-05-12 05:48:43 +00:00
Knugi
6cf6e50db9 Update README.md 2021-05-12 05:47:02 +00:00
KnugiHK
366d18b678 Create android_structure.png 2021-05-12 13:44:46 +08:00
Knugi
e8a8546a13 Update README.md 2021-05-12 05:44:32 +00:00
KnugiHK
baa733df5f Create old_README.md 2021-05-12 13:24:54 +08:00
KnugiHK
d9f38fc714 Create console script 2021-05-10 15:51:43 +08:00
KnugiHK
322281a8ec Prepare for publishing in PyPi 2021-05-10 15:51:30 +08:00
KnugiHK
3ee40ecda4 Move images to a folder 2021-05-10 15:49:41 +08:00
Knugi
f591c7a57f Update README.md 2021-05-10 05:48:25 +00:00
KnugiHK
e1e49261aa Merge branch 'main' of https://github.com/KnugiHK/Whatsapp-Chat-Exporter into main 2021-05-10 13:45:37 +08:00
KnugiHK
b58aaa8f73 Resize images 2021-05-10 13:45:30 +08:00
Knugi
c62f08340e Update README.md 2021-05-10 05:40:28 +00:00
Knugi
4fb759f974 Update README.md 2021-05-10 05:39:59 +00:00
KnugiHK
cb2e83721e Merge branch 'main' of https://github.com/KnugiHK/Whatsapp-Chat-Exporter into main 2021-05-10 13:36:56 +08:00
KnugiHK
be7e20317d Add support for encrypted iPhone backup 2021-05-10 13:33:13 +08:00
Knugi
cf2e69b594 Update README.md 2021-05-10 04:54:04 +00:00
Knugi
fb33a883e6 Update README.md 2021-05-10 04:47:52 +00:00
Knugi
58bc8634f7 Update README.md 2021-05-09 05:40:24 +00:00
24 changed files with 2295 additions and 616 deletions

79
.github/workflows/compile-binary.yml vendored Normal file

@@ -0,0 +1,79 @@
name: Compile standalone binary

on:
  release:
    types: [published]
  workflow_dispatch:

permissions:
  contents: read

jobs:
  linux:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v3
        with:
          python-version: '3.10'
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install pycryptodome javaobj-py3 ordered-set zstandard nuitka
          pip install .
      - name: Build binary with Nuitka
        run: |
          python -m nuitka --onefile --include-data-file=./Whatsapp_Chat_Exporter/whatsapp.html=./Whatsapp_Chat_Exporter/whatsapp.html --follow-imports Whatsapp_Chat_Exporter/__main__.py
          cp __main__.bin wtsexporter_linux_x64
      - uses: actions/upload-artifact@v3
        with:
          name: binary-linux
          path: |
            ./wtsexporter_linux_x64

  windows:
    runs-on: windows-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v3
        with:
          python-version: '3.10'
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install pycryptodome javaobj-py3 ordered-set zstandard nuitka
          pip install .
      - name: Build binary with Nuitka
        run: |
          python -m nuitka --onefile --include-data-file=./Whatsapp_Chat_Exporter/whatsapp.html=./Whatsapp_Chat_Exporter/whatsapp.html --assume-yes-for-downloads --follow-imports Whatsapp_Chat_Exporter\__main__.py
          copy __main__.exe wtsexporter_x64.exe
      - uses: actions/upload-artifact@v3
        with:
          name: binary-windows
          path: |
            .\wtsexporter_x64.exe

  macos:
    runs-on: macos-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v3
        with:
          python-version: '3.10'
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install pycryptodome javaobj-py3 ordered-set zstandard nuitka
          pip install .
      - name: Build binary with Nuitka
        run: |
          python -m nuitka --onefile --include-data-file=./Whatsapp_Chat_Exporter/whatsapp.html=./Whatsapp_Chat_Exporter/whatsapp.html --follow-imports Whatsapp_Chat_Exporter/__main__.py
          cp __main__.bin wtsexporter_macos_x64
      - uses: actions/upload-artifact@v3
        with:
          name: binary-macos
          path: |
            ./wtsexporter_macos_x64

36
.github/workflows/python-publish.yml vendored Normal file

@@ -0,0 +1,36 @@
# This workflow will upload a Python Package using Twine when a release is created
# For more information see: https://help.github.com/en/actions/language-and-framework-guides/using-python-with-github-actions#publishing-to-package-registries

# This workflow uses actions that are not certified by GitHub.
# They are provided by a third-party and are governed by
# separate terms of service, privacy policy, and support
# documentation.

name: Upload Python Package

on:
  release:
    types: [published]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.x'
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install build
      - name: Build package
        run: python -m build
      - name: Publish package
        uses: pypa/gh-action-pypi-publish@27b31702a0e7fc50959f5ad993c78deac1bdfc29
        with:
          user: __token__
          password: ${{ secrets.PYPI_API_TOKEN }}

1
CNAME Normal file

@@ -0,0 +1 @@
wts.knugi.com

147
README.md

@@ -1,45 +1,140 @@
# Whatsapp-Chat-Exporter
A Whatsapp database parser that will give you the history of your Whatsapp conversation in HTML and JSON
[![Latest in Pypi](https://img.shields.io/pypi/v/whatsapp-chat-exporter?label=Latest%20in%20Pypi)](https://pypi.org/project/whatsapp-chat-exporter/)
![License MIT](https://img.shields.io/pypi/l/whatsapp-chat-exporter)
[![Python](https://img.shields.io/pypi/pyversions/Whatsapp-Chat-Exporter)](https://pypi.org/project/Whatsapp-Chat-Exporter/)
A customizable Android and iPhone Whatsapp database parser that will give you the history of your Whatsapp conversations in HTML and JSON. Inspired by [Telegram Chat Export Tool](https://telegram.org/blog/export-and-more).
**If you plan to uninstall WhatsApp or delete your WhatsApp account, please make a backup of your WhatsApp database. You may want to use this exporter again on the same database in the future as the exporter develops**
# Usage
First, clone this repo, and copy all py and html files to a working directory if you want to do so.
**Usage in README may be removed in the future. Check the usage in [Wiki](https://github.com/KnugiHK/Whatsapp-Chat-Exporter/wiki)**.
**If you want to use the old release (< 0.5) of the exporter, please follow the [old usage guide](https://github.com/KnugiHK/Whatsapp-Chat-Exporter/wiki/Old-Usage#usage)**.
First, install the exporter by:
```shell
git clone https://github.com/KnugiHK/Whatsapp-Chat-Exporter.git
pip install whatsapp-chat-exporter
pip install whatsapp-chat-exporter[android_backup] :; # Optional, if you want it to support decrypting Android WhatsApp backup.
```
Then, ready your WhatsApp database, place them in the root of working directory.
* For Android, it is called msgstore.db. If you want name of your contacts, get the contact database, which is called wa.db.
* For iPhone, it is called 7c7fba66680ef796b916b067077cc246adacf01d (YES, a hash).
Then, create a working directory wherever you want:
```shell
mkdir working_wts
cd working_wts
```
## Working with Android
### Unencrypted WhatsApp database
Extract the WhatsApp database by whatever means you prefer. One option is the [WhatsApp-Key-DB-Extractor](https://github.com/KnugiHK/WhatsApp-Key-DB-Extractor).
Next, ready your media folder, place it in the root of working directory.
* For Android, copy the WhatsApp directory from your phone directly.
* For iPhone, run the extract_iphone_media.py, and you will get a folder called Message. Please note that, this script does not support encrypted backup.
```
python extract_iphone_media.py "C:\Users\[Username]\AppData\Roaming\Apple Computer\MobileSync\Backup\[device id]"
```
And now, you should have something like this:
After you obtain your WhatsApp database, copy the WhatsApp database and media folder to the working directory. The database is called msgstore.db. If you also want the name of your contacts, get the contact database, which is called wa.db. And copy the WhatsApp (Media) directory from your phone directly.
![Folder structure](structure.png)
And now, you should have something like this in the working directory.
Last, run the script regarding the type of phone.
![Android folder structure](imgs/android_structure.png)
#### Extracting
Simply invoke the following command from shell.
```sh
wtsexporter -a
```
python extract.py & :: Android
python extract_iphone.py & :: iPhone
### Encrypted Android WhatsApp Backup
In order to support the decryption, install pycryptodome if it is not installed
```sh
pip install pycryptodome # Or
pip install whatsapp-chat-exporter["android_backup"] # install along with this software
```
And you will get these:
### Crypt15 is now the easiest way to decrypt a backup. If you have the 32 bytes hex key generated when you enabled End-to-End encrypted backup, you can use it to decrypt the backup. If you do not have the 32 bytes hex key, you can still use the key file, extracted in the same way as the key file for Crypt12 and Crypt14, to decrypt the backup.
#### Crypt12 or Crypt14
Place the decryption key file (key) and the encrypted WhatsApp Backup (msgstore.db.crypt14) in the working directory. If you also want the name of your contacts, get the contact database, which is called wa.db. And copy the WhatsApp (Media) directory from your phone directly.
And now, you should have something like this in the working directory.
![Android folder structure with WhatsApp Backup](imgs/android_structure_backup.png)
#### Extracting
Simply invoke the following command from shell.
```sh
wtsexporter -a -k key -b msgstore.db.crypt14
```
#### Crypt15 (End-to-End Encrypted Backup)
To support Crypt15 backup, install javaobj-py3 if it is not installed
```sh
pip install javaobj-py3 # Or
pip install whatsapp-chat-exporter["crypt15"] # install along with this software
```
Place the encrypted WhatsApp Backup (msgstore.db.crypt15) in the working directory. If you also want the name of your contacts, get the contact database, which is called wa.db. And copy the WhatsApp (Media) directory from your phone directly.
If you do not have the 32 bytes hex key (64 hexdigits), place the decryption key file (encrypted_backup.key) extracted from Android in the working directory. If you have the 32 bytes hex key, simply pass the key on the command line.
Now, you should have something like this in the working directory (if you do not have 32 bytes hex key).
![Android folder structure with WhatsApp Crypt15 Backup](imgs/android_structure_backup_crypt15.png)
##### Extracting
If you do not have 32 bytes hex key but have the key file available, simply invoke the following command from shell.
```sh
wtsexporter -a -k encrypted_backup.key -b msgstore.db.crypt15
```
If you have the 32 bytes hex key, simply put the hex key in the -k option and invoke the command from shell like this:
```sh
wtsexporter -a -k 432435053b5204b08e5c3823423399aa30ff061435ab89bc4e6713969cdaa5a8 -b msgstore.db.crypt15
```
## Working with iPhone
Do an iPhone Backup with iTunes first.
### Encrypted iPhone Backup
**If you are working on unencrypted iPhone backup, skip this**
If you want to work on an encrypted iPhone Backup, you should install iphone_backup_decrypt from [KnugiHK/iphone_backup_decrypt](https://github.com/KnugiHK/iphone_backup_decrypt) before you run the extract_iphone_media.py.
```sh
pip install git+https://github.com/KnugiHK/iphone_backup_decrypt
```
### Extracting
Simply invoke the following command from shell, remember to replace the username and device id correspondingly in the command.
```sh
wtsexporter -i -b "C:\Users\[Username]\AppData\Roaming\Apple Computer\MobileSync\Backup\[device id]"
```
## Results
After extracting, you will get these:
#### Private Message
![Private Message](pm.png)
![Private Message](imgs/pm.png)
#### Group Message
![Group Message](group.png)
![Group Message](imgs/group.png)
# Encrypted iPhone Backup
To do
## More options
Invoking wtsexporter with the --help option shows all available options.
```sh
> wtsexporter --help
usage: wtsexporter [options]
options:
-h, --help show this help message and exit
-a, --android Define the target as Android
-i, --iphone, --ios Define the target as iPhone
-w WA, --wa WA Path to contact database (default: wa.db/ContactsV2.sqlite)
-m MEDIA, --media MEDIA
Path to WhatsApp media folder (default: WhatsApp)
-b BACKUP, --backup BACKUP
Path to Android (must be used together with -k)/iPhone WhatsApp backup
-o OUTPUT, --output OUTPUT
Output to specific directory (default: result)
-j [JSON], --json [JSON]
Save the result to a single JSON file (default if present: result.json)
-d DB, --db DB Path to database file (default: msgstore.db/7c7fba66680ef796b916b067077cc246adacf01d)
-k KEY, --key KEY Path to key file
-t TEMPLATE, --template TEMPLATE
Path to custom HTML template
-e, --embedded Embed media into HTML file (not yet implemented)
-s, --showkey Show the HEX key used to decrypt the database
-c, --move-media Move the media directory to output directory if the flag is set, otherwise copy it
--offline OFFLINE Relative path to offline static files
--size SIZE, --output-size SIZE
Maximum size of a single output file in bytes, 0 for auto (not yet implemented)
--no-html Do not output html files
--check-update Check for updates
```
# To do
1. Convert ```\r\n``` to ```<br>```
2. Reply in iPhone
3. The CSS for metadata (e.g. {Message Deleted})
4. Handle encrypted iPhone Backup
See [issues](https://github.com/KnugiHK/Whatsapp-Chat-Exporter/issues).
# Copyright
This is an MIT-licensed project.
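
A note on the Crypt15 instructions in this README: the value passed to `-k` is either a key file path or a 32-byte key written as 64 hexdigits. The sketch below shows one way such a string could be checked before use. `parse_hex_key` is a hypothetical helper written for illustration and is not part of the exporter.

```python
import string


def parse_hex_key(value: str) -> bytes:
    """Convert a 64-hexdigit string into 32 raw bytes.

    Hypothetical helper for illustration only; raises ValueError on bad input.
    """
    value = value.strip()
    if len(value) != 64:
        raise ValueError("expected 64 hex digits (32 bytes)")
    if not all(char in string.hexdigits for char in value):
        raise ValueError("key contains non-hexadecimal characters")
    return bytes.fromhex(value)


# The example key from the README above.
key = parse_hex_key("432435053b5204b08e5c3823423399aa30ff061435ab89bc4e6713969cdaa5a8")
print(len(key))
```

Checking the length up front gives a clearer error than letting a truncated or mistyped key fail later during decryption.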


@@ -0,0 +1 @@
__version__ = "0.9.1"


@@ -0,0 +1,287 @@
try:
    from .__init__ import __version__
except ImportError:
    from Whatsapp_Chat_Exporter.__init__ import __version__
from Whatsapp_Chat_Exporter import extract, extract_iphone
from Whatsapp_Chat_Exporter import extract_iphone_media
from Whatsapp_Chat_Exporter.data_model import ChatStore
from Whatsapp_Chat_Exporter.utility import Crypt, check_update
from argparse import ArgumentParser
import os
import sqlite3
import shutil
import json
import string
from sys import exit


def main():
    parser = ArgumentParser(
        description = 'A customizable Android and iPhone WhatsApp database parser that '
                      'will give you the history of your WhatsApp conversations in HTML '
                      'and JSON. Android Backup Crypt12, Crypt14 and Crypt15 supported.',
        epilog = f'WhatsApp Chat Exporter: {__version__} Licensed with MIT'
    )
    parser.add_argument(
        '-a',
        '--android',
        dest='android',
        default=False,
        action='store_true',
        help="Define the target as Android")
    parser.add_argument(
        '-i',
        '--iphone',
        '--ios',
        dest='iphone',
        default=False,
        action='store_true',
        help="Define the target as iPhone/iPad")
    parser.add_argument(
        "-w",
        "--wa",
        dest="wa",
        default=None,
        help="Path to contact database (default: wa.db/ContactsV2.sqlite)")
    parser.add_argument(
        "-m",
        "--media",
        dest="media",
        default=None,
        help="Path to WhatsApp media folder (default: WhatsApp)")
    parser.add_argument(
        "-b",
        "--backup",
        dest="backup",
        default=None,
        help="Path to Android (must be used together "
             "with -k)/iPhone WhatsApp backup")
    parser.add_argument(
        "-o",
        "--output",
        dest="output",
        default="result",
        help="Output to specific directory (default: result)")
    parser.add_argument(
        '-j',
        '--json',
        dest='json',
        nargs='?',
        default=None,
        type=str,
        const="result.json",
        help="Save the result to a single JSON file (default if present: result.json)")
    parser.add_argument(
        '-d',
        '--db',
        dest='db',
        default=None,
        help="Path to database file (default: msgstore.db/"
             "7c7fba66680ef796b916b067077cc246adacf01d)")
    parser.add_argument(
        '-k',
        '--key',
        dest='key',
        default=None,
        help="Path to key file"
    )
    parser.add_argument(
        "-t",
        "--template",
        dest="template",
        default=None,
        help="Path to custom HTML template"
    )
    parser.add_argument(
        "-e",
        "--embedded",
        dest="embedded",
        default=False,
        action='store_true',
        help="Embed media into HTML file (not yet implemented)"
    )
    parser.add_argument(
        "-s",
        "--showkey",
        dest="showkey",
        default=False,
        action='store_true',
        help="Show the HEX key used to decrypt the database"
    )
    parser.add_argument(
        "-c",
        "--move-media",
        dest="move_media",
        default=False,
        action='store_true',
        help="Move the media directory to output directory if the flag is set, otherwise copy it"
    )
    parser.add_argument(
        "--offline",
        dest="offline",
        default=None,
        help="Relative path to offline static files"
    )
    parser.add_argument(
        "--size",
        "--output-size",
        dest="size",
        default=None,
        help="Maximum size of a single output file in bytes, 0 for auto (not yet implemented)"
    )
    parser.add_argument(
        "--no-html",
        dest="no_html",
        default=False,
        action='store_true',
        help="Do not output html files"
    )
    parser.add_argument(
        "--check-update",
        dest="check_update",
        default=False,
        action='store_true',
        help="Check for updates"
    )
    args = parser.parse_args()

    # Check for updates
    if args.check_update:
        exit(check_update())

    # Sanity checks
    if args.android and args.iphone:
        print("You must define only one device type.")
        exit(1)
    if not args.android and not args.iphone:
        print("You must define the device type.")
        exit(1)
    if args.no_html and not args.json:
        print("You must either specify a JSON output file or enable HTML output.")
        exit(1)

    data = {}

    if args.android:
        contacts = extract.contacts
        messages = extract.messages
        media = extract.media
        vcard = extract.vcard
        create_html = extract.create_html
        if args.db is None:
            msg_db = "msgstore.db"
        else:
            msg_db = args.db
        if args.key is not None:
            if args.backup is None:
                print("You must specify the backup file with -b")
                exit(1)
            print("Decryption key specified, decrypting WhatsApp backup...")
            if "crypt12" in args.backup:
                crypt = Crypt.CRYPT12
            elif "crypt14" in args.backup:
                crypt = Crypt.CRYPT14
            elif "crypt15" in args.backup:
                crypt = Crypt.CRYPT15
            else:
                # Without this branch, `crypt` would be unbound below.
                print("Unknown backup format. The backup file name must "
                      "contain crypt12, crypt14 or crypt15.")
                exit(1)
            if os.path.isfile(args.key):
                key = open(args.key, "rb")
            elif all(char in string.hexdigits for char in args.key):
                key = bytes.fromhex(args.key)
            else:
                # Without this branch, `key` would be unbound below.
                print("The key must be a path to a key file or a hexadecimal string.")
                exit(1)
            db = open(args.backup, "rb").read()
            error = extract.decrypt_backup(db, key, msg_db, crypt, args.showkey)
            if error != 0:
                if error == 1:
                    print("Dependencies of decrypt_backup and/or extract_encrypted_key"
                          " are not present. For details, see README.md.")
                    exit(3)
                elif error == 2:
                    print("Failed when decompressing the decrypted backup. "
                          "Possibly incorrect offsets used in decryption.")
                    exit(4)
                else:
                    print("Unknown error occurred.", error)
                    exit(5)
        if args.wa is None:
            contact_db = "wa.db"
        else:
            contact_db = args.wa
        if args.media is None:
            args.media = "WhatsApp"

        if os.path.isfile(contact_db):
            with sqlite3.connect(contact_db) as db:
                db.row_factory = sqlite3.Row
                contacts(db, data)

    elif args.iphone:
        import sys
        if "--iphone" in sys.argv:
            print("WARNING: The --iphone flag is deprecated and will be removed in the future. Use --ios instead.")
        messages = extract_iphone.messages
        media = extract_iphone.media
        vcard = extract_iphone.vcard
        create_html = extract_iphone.create_html
        if args.backup is not None:
            extract_iphone_media.extract_media(args.backup)
        if args.db is None:
            msg_db = "7c7fba66680ef796b916b067077cc246adacf01d"
        else:
            msg_db = args.db
        if args.wa is None:
            contact_db = "ContactsV2.sqlite"
        else:
            contact_db = args.wa
        if args.media is None:
            args.media = "Message"

    if os.path.isfile(msg_db):
        with sqlite3.connect(msg_db) as db:
            db.row_factory = sqlite3.Row
            messages(db, data)
            media(db, data, args.media)
            vcard(db, data)
        if not args.no_html:
            create_html(
                data,
                args.output,
                args.template,
                args.embedded,
                args.offline,
                args.size
            )
    else:
        print(
            "The message database does not exist. You may specify the path "
            "to database file with option -d or check your provided path."
        )
        exit(2)

    if os.path.isdir(args.media):
        if os.path.isdir(f"{args.output}/{args.media}"):
            print("Media directory already exists in output directory. Skipping...")
        else:
            if not args.move_media:
                print("Copying media directory...")
                shutil.copytree(args.media, f"{args.output}/WhatsApp")
            else:
                try:
                    shutil.move(args.media, f"{args.output}/")
                except PermissionError:
                    print("Cannot remove original WhatsApp directory. "
                          "Perhaps the directory is opened?")

    if args.json:
        # Guard against an empty result so next(iter(data)) cannot raise StopIteration.
        if data and isinstance(data[next(iter(data))], ChatStore):
            data = {jik: chat.to_json() for jik, chat in data.items()}
        with open(args.json, "w") as f:
            data = json.dumps(data)
            print(f"\nWriting JSON file...({int(len(data)/1024/1024)}MB)")
            f.write(data)
    else:
        print()

    print("Everything is done!")


if __name__ == "__main__":
    main()
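
Review note: `main()` picks the backup format with substring checks such as `"crypt14" in args.backup`, which also match directory names anywhere in the path. A minimal sketch of a suffix-based alternative follows. The `Crypt` enum here is a stand-in with assumed values, since the real definition in `Whatsapp_Chat_Exporter.utility` is not shown in this diff, and `detect_crypt` is a hypothetical helper.

```python
from enum import Enum
from pathlib import Path
from typing import Optional


class Crypt(Enum):
    # Stand-in for Whatsapp_Chat_Exporter.utility.Crypt; the values are assumed.
    CRYPT12 = 12
    CRYPT14 = 14
    CRYPT15 = 15


def detect_crypt(backup_path: str) -> Optional[Crypt]:
    """Return the Crypt member matching the file suffix, or None if unknown."""
    suffix = Path(backup_path).suffix.lower().lstrip(".")
    by_name = {member.name.lower(): member for member in Crypt}
    return by_name.get(suffix)


print(detect_crypt("msgstore.db.crypt14"))
print(detect_crypt("crypt15_backups/msgstore.db"))
```

Returning `None` for an unrecognised suffix lets the caller print an error and exit instead of continuing with an unset format.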


@@ -0,0 +1,53 @@
from datetime import datetime
from typing import Union
class ChatStore():
def __init__(self, name=None):
if name is not None and not isinstance(name, str):
raise TypeError("Name must be a string or None")
self.name = name
self.messages = {}
def add_message(self, id, message):
if not isinstance(message, Message):
raise TypeError("Chat must be a Chat object")
self.messages[id] = message
def delete_message(self, id):
if id in self.messages:
del self.messages[id]
def to_json(self):
serialized_msgs = {id : msg.to_json() for id,msg in self.messages.items()}
return {'name' : self.name, 'messages' : serialized_msgs}
class Message():
def __init__(self, from_me: Union[bool,int], timestamp: int, time: str, key_id: int):
self.from_me = bool(from_me)
self.timestamp = timestamp / 1000 if timestamp > 9999999999 else timestamp
self.time = datetime.fromtimestamp(time/1000).strftime("%H:%M")
self.media = False
self.key_id = key_id
self.meta = False
self.data = None
self.sender = None
# Extra
self.reply = None
self.quoted_data = None
self.caption = None
self.mime = None  # Assigned later by media() and vcard(); defined here so every Message has it
def to_json(self):
return {
'from_me' : self.from_me,
'timestamp' : self.timestamp,
'time' : self.time,
'media' : self.media,
'key_id' : self.key_id,
'meta' : self.meta,
'data' : self.data,
'sender' : self.sender,
'reply' : self.reply,
'quoted_data' : self.quoted_data,
'caption' : self.caption
}
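
Reviewer sketch (not part of the commit): `Message.__init__` accepts timestamps in either seconds or milliseconds and uses `9999999999` as the cut-off between the two. The example below isolates that normalization. `normalize_timestamp` and `clock_time` are hypothetical helper names, and two details are assumptions made for a deterministic example: the time is formatted in UTC instead of local time, and the same threshold is applied to the `HH:MM` value (the commit always divides `time` by 1000).

```python
from datetime import datetime, timezone

MILLISECOND_THRESHOLD = 9999999999  # same cut-off as Message.__init__


def normalize_timestamp(timestamp):
    """Return seconds since the epoch, accepting seconds or milliseconds."""
    if timestamp > MILLISECOND_THRESHOLD:
        return timestamp / 1000
    return timestamp


def clock_time(timestamp, tz=timezone.utc):
    """Format a second or millisecond timestamp as HH:MM in the given zone."""
    seconds = normalize_timestamp(timestamp)
    return datetime.fromtimestamp(seconds, tz).strftime("%H:%M")


if __name__ == "__main__":
    print(normalize_timestamp(1684166400000))
    print(clock_time(1684166400000))
```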

@@ -0,0 +1,631 @@
#!/usr/bin/python3
import sqlite3
import json
import jinja2
import os
import shutil
import re
import io
import hmac
from pathlib import Path
from mimetypes import MimeTypes
from hashlib import sha256
from Whatsapp_Chat_Exporter.data_model import ChatStore, Message
from Whatsapp_Chat_Exporter.utility import sanitize_except, determine_day, Crypt
from Whatsapp_Chat_Exporter.utility import brute_force_offset, CRYPT14_OFFSETS
try:
import zlib
from Crypto.Cipher import AES
except ModuleNotFoundError:
support_backup = False
else:
support_backup = True
try:
import javaobj
except ModuleNotFoundError:
support_crypt15 = False
else:
support_crypt15 = True
def _generate_hmac_of_hmac(key_stream):
key = hmac.new(
hmac.new(
b'\x00' * 32,
key_stream,
sha256
).digest(),
b"backup encryption\x01",
sha256
)
return key.digest(), key_stream
def _extract_encrypted_key(keyfile):
key_stream = b""
for byte in javaobj.loads(keyfile):
key_stream += byte.to_bytes(1, "big", signed=True)
return _generate_hmac_of_hmac(key_stream)
def decrypt_backup(database, key, output, crypt=Crypt.CRYPT14, show_crypt15=False):
if not support_backup:
return 1
if isinstance(key, io.IOBase):
key = key.read()
if crypt is not Crypt.CRYPT15:
t1 = key[30:62]
if crypt is not Crypt.CRYPT15 and len(key) != 158:
raise ValueError("The key file must be 158 bytes")
# Determine the IV and database offsets
if crypt == Crypt.CRYPT14:
if len(database) < 191:
raise ValueError("The crypt14 file must be at least 191 bytes")
current_try = 0
offsets = CRYPT14_OFFSETS[current_try]
t2 = database[15:47]
iv = database[offsets["iv"]:offsets["iv"] + 16]
db_ciphertext = database[offsets["db"]:]
elif crypt == Crypt.CRYPT12:
if len(database) < 67:
raise ValueError("The crypt12 file must be at least 67 bytes")
t2 = database[3:35]
iv = database[51:67]
db_ciphertext = database[67:-20]
elif crypt == Crypt.CRYPT15:
if not support_crypt15:
return 1
if len(database) < 131:
raise ValueError("The crypt15 file must be at least 131 bytes")
t1 = t2 = None
iv = database[8:24]
db_offset = database[0] + 2 # Skip protobuf + protobuf size and backup type
db_ciphertext = database[db_offset:]
if t1 != t2:
raise ValueError("The signatures of the key file and the backup file do not match")
if crypt == Crypt.CRYPT15:
if len(key) == 32:
main_key, hex_key = _generate_hmac_of_hmac(key)
else:
main_key, hex_key = _extract_encrypted_key(key)
if show_crypt15:
hex_key = [hex_key.hex()[c:c+4] for c in range(0, len(hex_key.hex()), 4)]
print("The HEX key of the crypt15 backup is: " + ' '.join(hex_key))
else:
main_key = key[126:]
decompressed = False
while not decompressed:
cipher = AES.new(main_key, AES.MODE_GCM, iv)
db_compressed = cipher.decrypt(db_ciphertext)
try:
db = zlib.decompress(db_compressed)
except zlib.error:
if crypt == Crypt.CRYPT14:
current_try += 1
if current_try < len(CRYPT14_OFFSETS):
offsets = CRYPT14_OFFSETS[current_try]
iv = database[offsets["iv"]:offsets["iv"] + 16]
db_ciphertext = database[offsets["db"]:]
continue
else:
print("Common offsets are not applicable to "
"your backup. Trying to brute force it...")
for start_iv, end_iv, start_db in brute_force_offset():
iv = database[start_iv:end_iv]
db_ciphertext = database[start_db:]
cipher = AES.new(main_key, AES.MODE_GCM, iv)
db_compressed = cipher.decrypt(db_ciphertext)
try:
db = zlib.decompress(db_compressed)
except zlib.error:
continue
else:
decompressed = True
print(
f"The offsets of your IV and database are {start_iv} and "
f"{start_db}, respectively. To include your offsets in the "
"program, please report it by creating an issue on GitHub: "
"https://github.com/KnugiHK/Whatsapp-Chat-Exporter/issues/new"
)
break
if not decompressed:
return 2
else:
return 3
else:
decompressed = True
if db[0:6].upper() == b"SQLITE":
with open(output, "wb") as f:
f.write(db)
return 0
else:
raise ValueError("The plaintext is not a SQLite database. Did you use the key to encrypt something...")
def contacts(db, data):
# Get contacts
c = db.cursor()
c.execute("""SELECT count() FROM wa_contacts""")
total_row_number = c.fetchone()[0]
print(f"Gathering contacts...({total_row_number})")
c.execute("""SELECT jid, display_name FROM wa_contacts; """)
row = c.fetchone()
while row is not None:
data[row["jid"]] = ChatStore(row["display_name"])
row = c.fetchone()
def messages(db, data):
# Get message history
c = db.cursor()
try:
c.execute("""SELECT count() FROM messages""")
except sqlite3.OperationalError:
c.execute("""SELECT count() FROM message""")
total_row_number = c.fetchone()[0]
print(f"Gathering messages...(0/{total_row_number})", end="\r")
phone_number_re = re.compile(r"[0-9]+@s\.whatsapp\.net")
try:
c.execute("""SELECT messages.key_remote_jid,
messages._id,
messages.key_from_me,
messages.timestamp,
messages.data,
messages.status,
messages.edit_version,
messages.thumb_image,
messages.remote_resource,
messages.media_wa_type,
messages.latitude,
messages.longitude,
messages_quotes.key_id as quoted,
messages.key_id,
messages_quotes.data as quoted_data,
messages.media_caption
FROM messages
LEFT JOIN messages_quotes
ON messages.quoted_row_id = messages_quotes._id
WHERE messages.key_remote_jid <> '-1';"""
)
except sqlite3.OperationalError:
try:
c.execute("""SELECT jid_global.raw_string as key_remote_jid,
message._id,
message.from_me as key_from_me,
message.timestamp,
message.text_data as data,
message.status,
message_future.version as edit_version,
message_thumbnail.thumbnail as thumb_image,
message_media.file_path as remote_resource,
message_media.mime_type as media_wa_type,
message_location.latitude,
message_location.longitude,
message_quoted.key_id as quoted,
message.key_id,
message_quoted.text_data as quoted_data,
message.message_type,
jid_group.raw_string as group_sender_jid,
chat.subject as chat_subject
FROM message
LEFT JOIN message_quoted
ON message_quoted.message_row_id = message._id
LEFT JOIN message_location
ON message_location.message_row_id = message._id
LEFT JOIN message_media
ON message_media.message_row_id = message._id
LEFT JOIN message_thumbnail
ON message_thumbnail.message_row_id = message._id
LEFT JOIN message_future
ON message_future.message_row_id = message._id
LEFT JOIN chat
ON chat._id = message.chat_row_id
INNER JOIN jid jid_global
ON jid_global._id = chat.jid_row_id
LEFT JOIN jid jid_group
ON jid_group._id = message.sender_jid_row_id
WHERE key_remote_jid <> '-1';"""
)
except Exception as e:
raise e
else:
table_message = True
else:
table_message = False
i = 0
content = c.fetchone()
while content is not None:
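if content["key_remote_jid"] is None:
# Skip rows without a chat JID; advance the cursor first so the loop cannot spin forever
content = c.fetchone()
continue
if content["key_remote_jid"] not in data:
data[content["key_remote_jid"]] = ChatStore()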
data[content["key_remote_jid"]].add_message(content["_id"], Message(
from_me=content["key_from_me"],
timestamp=content["timestamp"],
time=content["timestamp"],
key_id=content["key_id"],
))
if "-" in content["key_remote_jid"] and content["key_from_me"] == 0:
name = None
if table_message:
if content["chat_subject"] is not None:
_jid = content["group_sender_jid"]
else:
_jid = content["key_remote_jid"]
if _jid in data:
name = data[_jid].name
fallback = _jid.split('@')[0] if "@" in _jid else None
else:
fallback = None
else:
if content["remote_resource"] in data:
name = data[content["remote_resource"]].name
if "@" in content["remote_resource"]:
fallback = content["remote_resource"].split('@')[0]
else:
fallback = None
else:
fallback = None
data[content["key_remote_jid"]].messages[content["_id"]].sender = name or fallback
else:
data[content["key_remote_jid"]].messages[content["_id"]].sender = None
if content["quoted"] is not None:
data[content["key_remote_jid"]].messages[content["_id"]].reply = content["quoted"]
data[content["key_remote_jid"]].messages[content["_id"]].quoted_data = content["quoted_data"]
else:
data[content["key_remote_jid"]].messages[content["_id"]].reply = None
if not table_message and content["media_caption"] is not None:
# Old schema
data[content["key_remote_jid"]].messages[content["_id"]].caption = content["media_caption"]
elif table_message and content["message_type"] == 1 and content["data"] is not None:
# New schema
data[content["key_remote_jid"]].messages[content["_id"]].caption = content["data"]
else:
data[content["key_remote_jid"]].messages[content["_id"]].caption = None
if content["status"] == 6: # 6 = Metadata, otherwise it's a message
if (not table_message and "-" in content["key_remote_jid"]) or \
(table_message and content["chat_subject"] is not None):
# Is Group
if content["data"] is not None:
try:
int(content["data"])
except ValueError:
msg = f"The group name changed to {content['data']}"
data[content["key_remote_jid"]].messages[content["_id"]].data = msg
data[content["key_remote_jid"]].messages[content["_id"]].meta = True
else:
data[content["key_remote_jid"]].delete_message(content["_id"])
else:
thumb_image = content["thumb_image"]
if thumb_image is not None:
if b"\x00\x00\x01\x74\x00\x1A" in thumb_image:
# Add user
added = phone_number_re.search(
thumb_image.decode("unicode_escape"))[0]
if added in data:
name_right = data[added].name
else:
name_right = added.split('@')[0]
if content["remote_resource"] is not None:
if content["remote_resource"] in data:
name_left = data[content["remote_resource"]].name
else:
name_left = content["remote_resource"].split('@')[0]
msg = f"{name_left} added {name_right or 'You'}"
else:
msg = f"Added {name_right or 'You'}"
elif b"\xac\xed\x00\x05\x74\x00" in thumb_image:
# Changed number
original = content["remote_resource"].split('@')[0]
changed = thumb_image[7:].decode().split('@')[0]
msg = f"{original} changed to {changed}"
data[content["key_remote_jid"]].messages[content["_id"]].data = msg
data[content["key_remote_jid"]].messages[content["_id"]].meta = True
else:
if content["data"] is None:
data[content["key_remote_jid"]].delete_message(content["_id"])
else:
# Private chat
if content["data"] is None and content["thumb_image"] is None:
data[content["key_remote_jid"]].delete_message(content["_id"])
else:
if content["key_from_me"] == 1:
if (content["status"] == 5 and content["edit_version"] == 7) or (table_message and content["message_type"] == 15):
msg = "Message deleted"
data[content["key_remote_jid"]].messages[content["_id"]].meta = True
else:
if content["media_wa_type"] == "5":
msg = f"Location shared: {content['latitude'], content['longitude']}"
data[content["key_remote_jid"]].messages[content["_id"]].meta = True
else:
msg = content["data"]
if msg is not None:
if "\r\n" in msg:
msg = msg.replace("\r\n", "<br>")
if "\n" in msg:
msg = msg.replace("\n", "<br>")
else:
if (content["status"] == 0 and content["edit_version"] == 7) or (table_message and content["message_type"] == 15):
msg = "Message deleted"
data[content["key_remote_jid"]].messages[content["_id"]].meta = True
else:
if content["media_wa_type"] == "5":
msg = f"Location shared: {content['latitude'], content['longitude']}"
data[content["key_remote_jid"]].messages[content["_id"]].meta = True
else:
msg = content["data"]
if msg is not None:
if "\r\n" in msg:
msg = msg.replace("\r\n", "<br>")
if "\n" in msg:
msg = msg.replace("\n", "<br>")
data[content["key_remote_jid"]].messages[content["_id"]].data = msg
i += 1
if i % 1000 == 0:
print(f"Gathering messages...({i}/{total_row_number})", end="\r")
content = c.fetchone()
print(f"Gathering messages...({total_row_number}/{total_row_number})", end="\r")
def media(db, data, media_folder):
# Get media
c = db.cursor()
c.execute("""SELECT count() FROM message_media""")
total_row_number = c.fetchone()[0]
print(f"\nGathering media...(0/{total_row_number})", end="\r")
i = 0
try:
c.execute("""SELECT messages.key_remote_jid,
message_row_id,
file_path,
message_url,
mime_type,
media_key
FROM message_media
INNER JOIN messages
ON message_media.message_row_id = messages._id
ORDER BY messages.key_remote_jid ASC"""
)
except sqlite3.OperationalError:
c.execute("""SELECT jid.raw_string as key_remote_jid,
message_row_id,
file_path,
message_url,
mime_type,
media_key
FROM message_media
INNER JOIN message
ON message_media.message_row_id = message._id
LEFT JOIN chat
ON chat._id = message.chat_row_id
INNER JOIN jid
ON jid._id = chat.jid_row_id
ORDER BY jid.raw_string ASC"""
)
content = c.fetchone()
mime = MimeTypes()
while content is not None:
file_path = f"{media_folder}/{content['file_path']}"
data[content["key_remote_jid"]].messages[content["message_row_id"]].media = True
if os.path.isfile(file_path):
data[content["key_remote_jid"]].messages[content["message_row_id"]].data = file_path
if content["mime_type"] is None:
guess = mime.guess_type(file_path)[0]
if guess is not None:
data[content["key_remote_jid"]].messages[content["message_row_id"]].mime = guess
else:
data[content["key_remote_jid"]].messages[content["message_row_id"]].mime = "data/data"
else:
data[content["key_remote_jid"]].messages[content["message_row_id"]].mime = content["mime_type"]
else:
# if "https://mmg" in content[4]:
# try:
# r = requests.get(content[3])
# if r.status_code != 200:
# raise RuntimeError()
# except:
# data[content[0]]["messages"][content[1]]["data"] = "{The media is missing}"
# data[content[0]]["messages"][content[1]]["media"] = True
# data[content[0]]["messages"][content[1]]["mime"] = "media"
# else:
data[content["key_remote_jid"]].messages[content["message_row_id"]].data = "The media is missing"
data[content["key_remote_jid"]].messages[content["message_row_id"]].mime = "media"
data[content["key_remote_jid"]].messages[content["message_row_id"]].meta = True
i += 1
if i % 100 == 0:
print(f"Gathering media...({i}/{total_row_number})", end="\r")
content = c.fetchone()
print(
f"Gathering media...({total_row_number}/{total_row_number})", end="\r")
def vcard(db, data):
c = db.cursor()
try:
c.execute("""SELECT message_row_id,
messages.key_remote_jid,
vcard,
messages.media_name
FROM messages_vcards
INNER JOIN messages
ON messages_vcards.message_row_id = messages._id
ORDER BY messages.key_remote_jid ASC;"""
)
except sqlite3.OperationalError:
c.execute("""SELECT message_row_id,
jid.raw_string as key_remote_jid,
vcard,
message.text_data as media_name
FROM message_vcard
INNER JOIN message
ON message_vcard.message_row_id = message._id
LEFT JOIN chat
ON chat._id = message.chat_row_id
INNER JOIN jid
ON jid._id = chat.jid_row_id
ORDER BY message.chat_row_id ASC;"""
)
rows = c.fetchall()
total_row_number = len(rows)
print(f"\nGathering vCards...(0/{total_row_number})", end="\r")
base = "WhatsApp/vCards"
if not os.path.isdir(base):
Path(base).mkdir(parents=True, exist_ok=True)
for index, row in enumerate(rows):
media_name = row["media_name"] if row["media_name"] is not None else ""
file_name = "".join(x for x in media_name if x.isalnum())
file_path = f"{base}/{file_name}.vcf"
if not os.path.isfile(file_path):
with open(file_path, "w", encoding="utf-8") as f:
f.write(row["vcard"])
data[row["key_remote_jid"]].messages[row["message_row_id"]].data = media_name + \
"The vCard file cannot be displayed here, " \
f"however it should be located at {file_path}"
data[row["key_remote_jid"]].messages[row["message_row_id"]].mime = "text/x-vcard"
data[row["key_remote_jid"]].messages[row["message_row_id"]].meta = True
print(f"Gathering vCards...({index + 1}/{total_row_number})", end="\r")
def create_html(
data,
output_folder,
template=None,
embedded=False,
offline_static=False,
maximum_size=None
):
if template is None:
template_dir = os.path.dirname(__file__)
template_file = "whatsapp.html"
else:
template_dir = os.path.dirname(template)
template_file = os.path.basename(template)
templateLoader = jinja2.FileSystemLoader(searchpath=template_dir)
templateEnv = jinja2.Environment(loader=templateLoader)
templateEnv.globals.update(determine_day=determine_day)
templateEnv.filters['sanitize_except'] = sanitize_except
template = templateEnv.get_template(template_file)
total_row_number = len(data)
print(f"\nCreating HTML...(0/{total_row_number})", end="\r")
if not os.path.isdir(output_folder):
os.mkdir(output_folder)
w3css = "https://www.w3schools.com/w3css/4/w3.css"
if offline_static:
import urllib.request
static_folder = os.path.join(output_folder, offline_static)
if not os.path.isdir(static_folder):
os.mkdir(static_folder)
w3css_path = os.path.join(static_folder, "w3.css")
if not os.path.isfile(w3css_path):
with urllib.request.urlopen(w3css) as resp:
with open(w3css_path, "wb") as f:
f.write(resp.read())
w3css = os.path.join(offline_static, "w3.css")
for current, contact in enumerate(data):
if len(data[contact].messages) == 0:
continue
phone_number = contact.split('@')[0]
if "-" in contact:
file_name = ""
else:
file_name = phone_number
if data[contact].name is not None:
if file_name != "":
file_name += "-"
file_name += data[contact].name.replace("/", "-")
name = data[contact].name
else:
name = phone_number
safe_file_name = "".join(x for x in file_name if x.isalnum() or x in "- ")
with open(f"{output_folder}/{safe_file_name}.html", "w", encoding="utf-8") as f:
f.write(
template.render(
name=name,
msgs=data[contact].messages.values(),
my_avatar=None,
their_avatar=f"WhatsApp/Avatars/{contact}.j",
w3css=w3css
)
)
if current % 10 == 0:
print(f"Creating HTML...({current}/{total_row_number})", end="\r")
print(f"Creating HTML...({total_row_number}/{total_row_number})", end="\r")
if __name__ == "__main__":
from optparse import OptionParser
parser = OptionParser()
parser.add_option(
"-w",
"--wa",
dest="wa",
default="wa.db",
help="Path to contact database")
parser.add_option(
"-m",
"--media",
dest="media",
default="WhatsApp",
help="Path to WhatsApp media folder"
)
# parser.add_option(
# "-t",
# "--template",
# dest="html",
# default="wa.db",
# help="Path to HTML template")
(options, args) = parser.parse_args()
msg_db = "msgstore.db"
output_folder = "temp"
contact_db = options.wa
media_folder = options.media
if len(args) == 1:
msg_db = args[0]
elif len(args) == 2:
msg_db = args[0]
output_folder = args[1]
data = {}
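if os.path.isfile(contact_db):
with sqlite3.connect(contact_db) as db:
db.row_factory = sqlite3.Row  # contacts() reads columns by name
contacts(db, data)
if os.path.isfile(msg_db):
with sqlite3.connect(msg_db) as db:
db.row_factory = sqlite3.Row  # messages(), media() and vcard() read columns by name
messages(db, data)
media(db, data, media_folder)
vcard(db, data)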
create_html(data, output_folder)
if not os.path.isdir(f"{output_folder}/WhatsApp"):
shutil.move(media_folder, f"{output_folder}/")
with open("result.json", "w") as f:
data = json.dumps({jid: chat.to_json() for jid, chat in data.items()})
print(f"\nWriting JSON file...({int(len(data)/1024/1024)}MB)")
f.write(data)
print("Everything is done!")
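
Reviewer sketch (not part of the commit): for crypt15 backups, `_generate_hmac_of_hmac` derives the AES key by applying HMAC-SHA256 twice, first with a 32-byte zero key over the root key and then with that digest as the key over the fixed label `b"backup encryption\x01"`. The standalone version below uses only the standard library. `derive_crypt15_key` and `group_hex` are hypothetical names; `group_hex` mirrors the four-character grouping used when `show_crypt15` prints the key.

```python
import hmac
from hashlib import sha256


def derive_crypt15_key(root_key: bytes) -> bytes:
    """Apply HMAC-SHA256 twice, as _generate_hmac_of_hmac does."""
    # Step 1: a 32-byte zero key authenticates the root key.
    inner = hmac.new(b"\x00" * 32, root_key, sha256).digest()
    # Step 2: the first digest becomes the key for the fixed label.
    return hmac.new(inner, b"backup encryption\x01", sha256).digest()


def group_hex(key: bytes, width: int = 4) -> str:
    """Render bytes as hex split into space-separated groups."""
    hex_key = key.hex()
    return " ".join(hex_key[i:i + width] for i in range(0, len(hex_key), width))


if __name__ == "__main__":
    root = bytes(range(32))  # placeholder root key, not a real one
    print(len(derive_crypt15_key(root)))
    print(group_hex(root))
```

The derived key is always 32 bytes, which is the size of an AES-256 key as passed to `AES.new` in `decrypt_backup`.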

@@ -4,21 +4,11 @@ import sqlite3
import json
import jinja2
import os
import requests
import shutil
from pathlib import Path
from datetime import datetime
from mimetypes import MimeTypes
APPLE_TIME = datetime.timestamp(datetime(2001, 1, 1))
def determine_day(last, current):
last = datetime.fromtimestamp(last).date()
current = datetime.fromtimestamp(current).date()
if last == current:
return None
else:
return current
from Whatsapp_Chat_Exporter.utility import sanitize_except, determine_day, APPLE_TIME
def messages(db, data):
@@ -61,7 +51,9 @@ def messages(db, data):
"time": datetime.fromtimestamp(ts).strftime("%H:%M"),
"media": False,
"reply": None,
"caption": None
"caption": None,
"meta": False,
"data": None
}
if "-" in content[0] and content[2] == 0:
name = None
@@ -86,8 +78,9 @@ def messages(db, data):
try:
int(content[4])
except ValueError:
msg = "{The group name changed to "f"{content[4]}"" }"
msg = f"The group name changed to {content[4]}"
data[content[0]]["messages"][content[1]]["data"] = msg
data[content[0]]["messages"][content[1]]["meta"] = True
else:
del data[content[0]]["messages"][content[1]]
else:
@@ -98,14 +91,26 @@ def messages(db, data):
# real message
if content[2] == 1:
if content[5] == 14:
msg = "{Message deleted}"
msg = "Message deleted"
data[content[0]]["messages"][content[1]]["meta"] = True
else:
msg = content[4]
if msg is not None:
if "\r\n" in msg:
msg = msg.replace("\r\n", "<br>")
if "\n" in msg:
msg = msg.replace("\n", "<br>")
else:
if content[5] == 14:
msg = "{Message deleted}"
msg = "Message deleted"
data[content[0]]["messages"][content[1]]["meta"] = True
else:
msg = content[4]
if msg is not None:
if "\r\n" in msg:
msg = msg.replace("\r\n", "<br>")
if "\n" in msg:
msg = msg.replace("\n", "<br>")
data[content[0]]["messages"][content[1]]["data"] = msg
i += 1
if i % 1000 == 0:
@@ -137,7 +142,7 @@ def media(db, data, media_folder):
content = c.fetchone()
mime = MimeTypes()
while content is not None:
file_path = f"Message/{content[2]}"
file_path = f"{media_folder}/{content[2]}"
data[content[0]]["messages"][content[1]]["media"] = True
if os.path.isfile(file_path):
@@ -160,8 +165,9 @@ def media(db, data, media_folder):
# data[content[0]]["messages"][content[1]]["data"] = "{The media is missing}"
# data[content[0]]["messages"][content[1]]["mime"] = "media"
# else:
data[content[0]]["messages"][content[1]]["data"] = "{The media is missing}"
data[content[0]]["messages"][content[1]]["data"] = "The media is missing"
data[content[0]]["messages"][content[1]]["mime"] = "media"
data[content[0]]["messages"][content[1]]["meta"] = True
if content[6] is not None:
data[content[0]]["messages"][content[1]]["caption"] = content[6]
i += 1
@@ -189,28 +195,35 @@ def vcard(db, data):
total_row_number = len(rows)
print(f"\nGathering vCards...(0/{total_row_number})", end="\r")
base = "Message/vCards"
if not os.path.isdir(base):
Path(base).mkdir(parents=True, exist_ok=True)
for index, row in enumerate(rows):
if not os.path.isdir(base):
os.mkdir(base)
file_name = "".join(x for x in row[3] if x.isalnum())
file_path = f"{base}/{file_name[:200]}.vcf"
if not os.path.isfile(file_path):
with open(file_path, "w", encoding="utf-8") as f:
f.write(row[4])
data[row[2]]["messages"][row[1]]["data"] = row[3] + \
"{ The vCard file cannot be displayed here, however it " \
"should be located at " + file_path + "}"
"The vCard file cannot be displayed here, " \
f"however it should be located at {file_path}"
data[row[2]]["messages"][row[1]]["mime"] = "text/x-vcard"
data[row[2]]["messages"][row[1]]["media"] = True
data[row[2]]["messages"][row[1]]["meta"] = True
print(f"Gathering vCards...({index + 1}/{total_row_number})", end="\r")
def create_html(data, output_folder):
templateLoader = jinja2.FileSystemLoader(searchpath="./")
def create_html(data, output_folder, template=None, embedded=False, offline_static=False, maximum_size=None):
if template is None:
template_dir = os.path.dirname(__file__)
template_file = "whatsapp.html"
else:
template_dir = os.path.dirname(template)
template_file = os.path.basename(template)
templateLoader = jinja2.FileSystemLoader(searchpath=template_dir)
templateEnv = jinja2.Environment(loader=templateLoader)
templateEnv.globals.update(determine_day=determine_day)
TEMPLATE_FILE = "whatsapp.html"
template = templateEnv.get_template(TEMPLATE_FILE)
templateEnv.filters['sanitize_except'] = sanitize_except
template = templateEnv.get_template(template_file)
total_row_number = len(data)
print(f"\nCreating HTML...(0/{total_row_number})", end="\r")
@@ -218,6 +231,18 @@ def create_html(data, output_folder):
if not os.path.isdir(output_folder):
os.mkdir(output_folder)
w3css = "https://www.w3schools.com/w3css/4/w3.css"
if offline_static:
import urllib.request
static_folder = os.path.join(output_folder, offline_static)
if not os.path.isdir(static_folder):
os.mkdir(static_folder)
w3css_path = os.path.join(static_folder, "w3.css")
if not os.path.isfile(w3css_path):
with urllib.request.urlopen(w3css) as resp:
with open(w3css_path, "wb") as f: f.write(resp.read())
w3css = os.path.join(offline_static, "w3.css")
for current, contact in enumerate(data):
if len(data[contact]["messages"]) == 0:
continue
@@ -243,7 +268,8 @@ def create_html(data, output_folder):
name=name,
msgs=data[contact]["messages"].values(),
my_avatar=None,
their_avatar=f"WhatsApp/Avatars/{contact}.j"
their_avatar=f"WhatsApp/Avatars/{contact}.j",
w3css=w3css
)
)
if current % 10 == 0:
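
Reviewer sketch (not part of the commit): this diff moves `determine_day` into the shared utility module, where the HTML template uses it to decide when a new date header is needed between two consecutive messages. The example below reproduces the comparison and applies it to a list of timestamps. The `tz` parameter and the `day_breaks` helper are assumptions added so the example gives the same result on any machine; the commit compares dates in local time.

```python
from datetime import datetime, timezone


def determine_day(last, current, tz=timezone.utc):
    """Return the date of `current` if it falls on a different day than `last`."""
    last_day = datetime.fromtimestamp(last, tz).date()
    current_day = datetime.fromtimestamp(current, tz).date()
    return None if last_day == current_day else current_day


def day_breaks(timestamps, tz=timezone.utc):
    """Return the dates at which a new-day header would be shown."""
    breaks = []
    for last, current in zip(timestamps, timestamps[1:]):
        day = determine_day(last, current, tz)
        if day is not None:
            breaks.append(day)
    return breaks


if __name__ == "__main__":
    stamps = [1684166400, 1684166460, 1684252800]
    print(day_breaks(stamps))
```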

@@ -0,0 +1,136 @@
#!/usr/bin/python3
import shutil
import sqlite3
import os
import getpass
try:
from iphone_backup_decrypt import EncryptedBackup, RelativePath
except ModuleNotFoundError:
support_encrypted = False
else:
support_encrypted = True
def extract_encrypted(base_dir, password):
backup = EncryptedBackup(backup_directory=base_dir, passphrase=password)
print("Decrypting WhatsApp database...")
backup.extract_file(relative_path=RelativePath.WHATSAPP_MESSAGES,
output_filename="7c7fba66680ef796b916b067077cc246adacf01d")
backup.extract_file(relative_path=RelativePath.WHATSAPP_CONTACTS,
output_filename="ContactsV2.sqlite")
data = backup.execute_sql("""SELECT count()
FROM Files
WHERE relativePath
LIKE 'Message/Media/%'"""
)
total_row_number = data[0][0]
print(f"Gathering media...(0/{total_row_number})", end="\r")
data = backup.execute_sql("""SELECT fileID,
relativePath,
flags,
file
FROM Files
WHERE relativePath
LIKE 'Message/Media/%'"""
)
if not os.path.isdir("Message"):
os.mkdir("Message")
if not os.path.isdir("Message/Media"):
os.mkdir("Message/Media")
i = 0
for row in data:
destination = row[1]
hashes = row[0]
folder = hashes[:2]
flags = row[2]
file = row[3]
if flags == 2:
try:
os.mkdir(destination)
except FileExistsError:
pass
elif flags == 1:
decrypted = backup.decrypt_inner_file(file_id=hashes, file_bplist=file)
with open(destination, "wb") as f:
f.write(decrypted)
i += 1
if i % 100 == 0:
print(f"Gathering media...({i}/{total_row_number})", end="\r")
print(f"Gathering media...({total_row_number}/{total_row_number})", end="\r")
def is_encrypted(base_dir):
with sqlite3.connect(f"{base_dir}/Manifest.db") as f:
c = f.cursor()
try:
c.execute("""SELECT count()
FROM Files
""")
except sqlite3.DatabaseError:
return True
else:
return False
def extract_media(base_dir):
if is_encrypted(base_dir):
if not support_encrypted:
print("You don't have the dependencies to handle encrypted backups.")
print("Read more on how to deal with encrypted backups:")
print("https://github.com/KnugiHK/Whatsapp-Chat-Exporter/blob/main/README.md#usage")
return False
password = getpass.getpass("Enter the password:")
extract_encrypted(base_dir, password)
else:
wts_db = os.path.join(base_dir, "7c/7c7fba66680ef796b916b067077cc246adacf01d")
if not os.path.isfile(wts_db):
print("WhatsApp database not found.")
exit()
else:
shutil.copyfile(wts_db, "7c7fba66680ef796b916b067077cc246adacf01d")
with sqlite3.connect(f"{base_dir}/Manifest.db") as manifest:
c = manifest.cursor()
c.execute("""SELECT count()
FROM Files
WHERE relativePath
LIKE 'Message/Media/%'""")
total_row_number = c.fetchone()[0]
print(f"Gathering media...(0/{total_row_number})", end="\r")
c.execute("""SELECT fileID,
relativePath,
flags
FROM Files
WHERE relativePath
LIKE 'Message/Media/%'""")
row = c.fetchone()
if not os.path.isdir("Message"):
os.mkdir("Message")
if not os.path.isdir("Message/Media"):
os.mkdir("Message/Media")
i = 0
while row is not None:
destination = row[1]
hashes = row[0]
folder = hashes[:2]
flags = row[2]
if flags == 2:
try:
os.mkdir(destination)
except FileExistsError:
pass
elif flags == 1:
shutil.copyfile(f"{base_dir}/{folder}/{hashes}", destination)
i += 1
if i % 100 == 0:
print(f"Gathering media...({i}/{total_row_number})", end="\r")
row = c.fetchone()
print(f"Gathering media...({total_row_number}/{total_row_number})", end="\r")
if __name__ == "__main__":
from optparse import OptionParser
parser = OptionParser()
(_, args) = parser.parse_args()
base_dir = args[0]
extract_media(base_dir)
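
Reviewer sketch (not part of the commit): for an unencrypted iOS backup, `extract_media` reads `Manifest.db`, creates a directory for each row with `flags == 2`, and copies each row with `flags == 1` from `{base_dir}/{fileID[:2]}/{fileID}` to its `relativePath`. The example below separates the planning from the copying so it can run against an in-memory database. `backup_source_path` and `plan_media_copy` are hypothetical names, and the three-column `Files` table in the demo is a reduced stand-in for the real manifest schema.

```python
import sqlite3


def backup_source_path(base_dir, file_id):
    """Backup files are stored under a folder named after the first two hash characters."""
    return f"{base_dir}/{file_id[:2]}/{file_id}"


def plan_media_copy(manifest, base_dir):
    """Return (directories to create, (source, destination) pairs to copy)."""
    directories, files = [], []
    rows = manifest.execute(
        """SELECT fileID, relativePath, flags
           FROM Files
           WHERE relativePath LIKE 'Message/Media/%'"""
    )
    for file_id, relative_path, flags in rows:
        if flags == 2:  # directory entry
            directories.append(relative_path)
        elif flags == 1:  # regular file
            files.append((backup_source_path(base_dir, file_id), relative_path))
    return directories, files


if __name__ == "__main__":
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE Files (fileID TEXT, relativePath TEXT, flags INTEGER)")
    demo.execute("INSERT INTO Files VALUES ('cd34', 'Message/Media/chat/a.jpg', 1)")
    print(plan_media_copy(demo, "/backup"))
    demo.close()
```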

@@ -0,0 +1,588 @@
#!/usr/bin/python3
import sqlite3
import json
import jinja2
import os
import shutil
import re
import io
import hmac
from pathlib import Path
from bleach import clean as sanitize
from markupsafe import Markup
from datetime import datetime
from enum import Enum
from mimetypes import MimeTypes
from hashlib import sha256
from Whatsapp_Chat_Exporter.data_model import ChatStore, Message
try:
import zlib
from Crypto.Cipher import AES
except ModuleNotFoundError:
support_backup = False
else:
support_backup = True
try:
import javaobj
except ModuleNotFoundError:
support_crypt15 = False
else:
support_crypt15 = True
def sanitize_except(html):
return Markup(sanitize(html, tags=["br"]))
def determine_day(last, current):
last = datetime.fromtimestamp(last).date()
current = datetime.fromtimestamp(current).date()
if last == current:
return None
else:
return current
CRYPT14_OFFSETS = (
{"iv": 67, "db": 191},
{"iv": 67, "db": 190},
{"iv": 66, "db": 99},
{"iv": 67, "db": 193}
)
class Crypt(Enum):
CRYPT15 = 15
CRYPT14 = 14
CRYPT12 = 12
def brute_force_offset():
for iv in range(0, 200):
for db in range(0, 200):
yield iv, iv + 16, db
def _generate_hmac_of_hmac(key_stream):
key = hmac.new(
hmac.new(
b'\x00' * 32,
key_stream,
sha256
).digest(),
b"backup encryption\x01",
sha256
)
return key.digest(), key_stream
def _extract_encrypted_key(keyfile):
key_stream = b""
for byte in javaobj.loads(keyfile):
key_stream += byte.to_bytes(1, "big", signed=True)
return _generate_hmac_of_hmac(key_stream)
def decrypt_backup(database, key, output, crypt=Crypt.CRYPT14, show_crypt15=False):
if not support_backup:
return 1
if isinstance(key, io.IOBase):
key = key.read()
if crypt is not Crypt.CRYPT15:
t1 = key[30:62]
if crypt is not Crypt.CRYPT15 and len(key) != 158:
raise ValueError("The key file must be 158 bytes")
if crypt == Crypt.CRYPT14:
if len(database) < 191:
raise ValueError("The crypt14 file must be at least 191 bytes")
current_try = 0
offsets = CRYPT14_OFFSETS[current_try]
t2 = database[15:47]
iv = database[offsets["iv"]:offsets["iv"] + 16]
db_ciphertext = database[offsets["db"]:]
elif crypt == Crypt.CRYPT12:
if len(database) < 67:
raise ValueError("The crypt12 file must be at least 67 bytes")
t2 = database[3:35]
iv = database[51:67]
db_ciphertext = database[67:-20]
elif crypt == Crypt.CRYPT15:
if not support_crypt15:
return 1
if len(database) < 131:
raise ValueError("The crypt15 file must be at least 131 bytes")
t1 = t2 = None
iv = database[8:24]
db_offset = database[0] + 2 # Skip protobuf + protobuf size and backup type
db_ciphertext = database[db_offset:]
if t1 != t2:
raise ValueError("The signatures of the key file and the backup file do not match")
if crypt == Crypt.CRYPT15:
if len(key) == 32:
main_key, hex_key = _generate_hmac_of_hmac(key)
else:
main_key, hex_key = _extract_encrypted_key(key)
if show_crypt15:
hex_key = [hex_key.hex()[c:c+4] for c in range(0, len(hex_key.hex()), 4)]
print("The HEX key of the crypt15 backup is: " + ' '.join(hex_key))
else:
main_key = key[126:]
decompressed = False
while not decompressed:
cipher = AES.new(main_key, AES.MODE_GCM, iv)
db_compressed = cipher.decrypt(db_ciphertext)
try:
db = zlib.decompress(db_compressed)
except zlib.error:
if crypt == Crypt.CRYPT14:
current_try += 1
if current_try < len(CRYPT14_OFFSETS):
offsets = CRYPT14_OFFSETS[current_try]
iv = database[offsets["iv"]:offsets["iv"] + 16]
db_ciphertext = database[offsets["db"]:]
continue
else:
print("Common offsets are not applicable to "
"your backup. Trying to brute force it...")
for start_iv, end_iv, start_db in brute_force_offset():
iv = database[start_iv:end_iv]
db_ciphertext = database[start_db:]
cipher = AES.new(main_key, AES.MODE_GCM, iv)
db_compressed = cipher.decrypt(db_ciphertext)
try:
db = zlib.decompress(db_compressed)
except zlib.error:
continue
else:
decompressed = True
print(
f"The offsets of your IV and database are {start_iv} and "
f"{start_db}, respectively. To include your offsets in the "
"program, please report it by creating an issue on GitHub: "
"https://github.com/KnugiHK/Whatsapp-Chat-Exporter/issues/new"
)
break
if not decompressed:
return 2
else:
return 3
else:
decompressed = True
if db[0:6].upper() == b"SQLITE":
with open(output, "wb") as f:
f.write(db)
return 0
else:
raise ValueError("The plaintext is not a SQLite database. Was this key used to encrypt a different file?")
def contacts(db, data):
# Get contacts
c = db.cursor()
c.execute("""SELECT count() FROM wa_contacts""")
total_row_number = c.fetchone()[0]
print(f"Gathering contacts...({total_row_number})")
c.execute("""SELECT jid, display_name FROM wa_contacts; """)
row = c.fetchone()
while row is not None:
data[row["jid"]] = ChatStore(row["display_name"])
row = c.fetchone()
def messages(db, data):
# Get message history
c = db.cursor()
c.execute("""SELECT count() FROM message""")
total_row_number = c.fetchone()[0]
print(f"Gathering messages...(0/{total_row_number})", end="\r")
phone_number_re = re.compile(r"[0-9]+@s.whatsapp.net")
c.execute("""SELECT jid_global.raw_string as key_remote_jid,
message._id,
message.from_me as key_from_me,
message.timestamp,
message.text_data as data,
message.status,
message_future.version as edit_version,
message_thumbnail.thumbnail as thumb_image,
message_media.file_path as remote_resource,
message_media.mime_type as media_wa_type,
message_location.latitude,
message_location.longitude,
message_quoted.key_id as quoted,
message.key_id,
message_quoted.text_data as quoted_data,
message.message_type,
jid_group.raw_string as group_sender_jid,
chat.subject as chat_subject
FROM message
LEFT JOIN message_quoted
ON message_quoted.message_row_id = message._id
LEFT JOIN message_location
ON message_location.message_row_id = message._id
LEFT JOIN message_media
ON message_media.message_row_id = message._id
LEFT JOIN message_thumbnail
ON message_thumbnail.message_row_id = message._id
LEFT JOIN message_future
ON message_future.message_row_id = message._id
LEFT JOIN chat
ON chat._id = message.chat_row_id
INNER JOIN jid jid_global
ON jid_global._id = chat.jid_row_id
LEFT JOIN jid jid_group
ON jid_group._id = message.sender_jid_row_id
WHERE key_remote_jid <> '-1';""")
i = 0
content = c.fetchone()
while content is not None:
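# Skip rows without a chat JID before touching `data`, and advance the
# cursor first so that `continue` cannot loop forever on the same row.
if content["key_remote_jid"] is None:
content = c.fetchone()
continue
if content["key_remote_jid"] not in data:
data[content["key_remote_jid"]] = ChatStore()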
data[content["key_remote_jid"]].add_message(content["_id"], Message(
from_me=content["key_from_me"],
timestamp=content["timestamp"],
time=content["timestamp"],
key_id=content["key_id"],
))
if "-" in content["key_remote_jid"] and content["key_from_me"] == 0:
name = None
if content["chat_subject"] is not None:
_jid = content["group_sender_jid"]
else:
_jid = content["key_remote_jid"]
if _jid in data:
name = data[_jid].name
fallback = _jid.split('@')[0] if "@" in _jid else None
else:
fallback = None
data[content["key_remote_jid"]].messages[content["_id"]].sender = name or fallback
else:
data[content["key_remote_jid"]].messages[content["_id"]].sender = None
if content["quoted"] is not None:
data[content["key_remote_jid"]].messages[content["_id"]].reply = content["quoted"]
data[content["key_remote_jid"]].messages[content["_id"]].quoted_data = content["quoted_data"]
else:
data[content["key_remote_jid"]].messages[content["_id"]].reply = None
if content["message_type"] == 1:
data[content["key_remote_jid"]].messages[content["_id"]].caption = content["data"]
else:
data[content["key_remote_jid"]].messages[content["_id"]].caption = None
if content["status"] == 6:
if content["chat_subject"] is not None:
# Is Group
if content["data"] is not None:
try:
int(content["data"])
except ValueError:
msg = f"The group name changed to {content['data']}"
data[content["key_remote_jid"]].messages[content["_id"]].data = msg
data[content["key_remote_jid"]].messages[content["_id"]].meta = True
else:
data[content["key_remote_jid"]].delete_message(content["_id"])
else:
thumb_image = content["thumb_image"]
if thumb_image is not None:
if b"\x00\x00\x01\x74\x00\x1A" in thumb_image:
# Add user
added = phone_number_re.search(
thumb_image.decode("unicode_escape"))[0]
if added in data:
name_right = data[added].name
else:
name_right = added.split('@')[0]
if content["remote_resource"] is not None:
if content["remote_resource"] in data:
name_left = data[content["remote_resource"]].name
else:
name_left = content["remote_resource"].split('@')[0]
msg = f"{name_left} added {name_right or 'You'}"
else:
msg = f"Added {name_right or 'You'}"
elif b"\xac\xed\x00\x05\x74\x00" in thumb_image:
# Changed number
original = content["remote_resource"].split('@')[0]
changed = thumb_image[7:].decode().split('@')[0]
msg = f"{original} changed to {changed}"
data[content["key_remote_jid"]].messages[content["_id"]].data = msg
data[content["key_remote_jid"]].messages[content["_id"]].meta = True
else:
if content["data"] is None:
data[content["key_remote_jid"]].delete_message(content["_id"])
else:
# Private chat
if content["data"] is None and content["thumb_image"] is None:
data[content["key_remote_jid"]].delete_message(content["_id"])
else:
if content["key_from_me"] == 1:
if content["status"] == 5 and content["edit_version"] == 7:
msg = "Message deleted"
data[content["key_remote_jid"]].messages[content["_id"]].meta = True
else:
if content["media_wa_type"] == "5":
msg = f"Location shared: {content[10], content[11]}"
data[content["key_remote_jid"]].messages[content["_id"]].meta = True
else:
msg = content["data"]
if msg is not None:
if "\r\n" in msg:
msg = msg.replace("\r\n", "<br>")
if "\n" in msg:
msg = msg.replace("\n", "<br>")
else:
if content["status"] == 0 and content["edit_version"] == 7:
msg = "Message deleted"
data[content["key_remote_jid"]].messages[content["_id"]].meta = True
else:
if content["media_wa_type"] == "5":
msg = f"Location shared: {content[10], content[11]}"
data[content["key_remote_jid"]].messages[content["_id"]].meta = True
else:
msg = content["data"]
if msg is not None:
if "\r\n" in msg:
msg = msg.replace("\r\n", "<br>")
if "\n" in msg:
msg = msg.replace("\n", "<br>")
data[content["key_remote_jid"]].messages[content["_id"]].data = msg
i += 1
if i % 1000 == 0:
print(f"Gathering messages...({i}/{total_row_number})", end="\r")
content = c.fetchone()
print(f"Gathering messages...({total_row_number}/{total_row_number})", end="\r")
def media(db, data, media_folder):
# Get media
c = db.cursor()
c.execute("""SELECT count() FROM message_media""")
total_row_number = c.fetchone()[0]
print(f"\nGathering media...(0/{total_row_number})", end="\r")
i = 0
c.execute("""SELECT jid.raw_string,
message_row_id,
file_path,
message_url,
mime_type,
media_key
FROM message_media
INNER JOIN message
ON message_media.message_row_id = message._id
LEFT JOIN chat
ON chat._id = message.chat_row_id
INNER JOIN jid
ON jid._id = chat.jid_row_id
ORDER BY jid.raw_string ASC""")
content = c.fetchone()
mime = MimeTypes()
while content is not None:
file_path = f"{media_folder}/{content['file_path']}"
data[content["raw_string"]].messages[content["message_row_id"]].media = True
if os.path.isfile(file_path):
data[content["raw_string"]].messages[content["message_row_id"]].data = file_path
if content["mime_type"] is None:
guess = mime.guess_type(file_path)[0]
if guess is not None:
data[content["raw_string"]].messages[content["message_row_id"]].mime = guess
else:
data[content["raw_string"]].messages[content["message_row_id"]].mime = "data/data"
else:
data[content["raw_string"]].messages[content["message_row_id"]].mime = content["mime_type"]
else:
# if "https://mmg" in content["mime_type"]:
# try:
# r = requests.get(content["message_url"])
# if r.status_code != 200:
# raise RuntimeError()
# except:
# data[content["raw_string"]].messages[content["message_row_id"]].data = "{The media is missing}"
# data[content["raw_string"]].messages[content["message_row_id"]].media = True
# data[content["raw_string"]].messages[content["message_row_id"]].mime = "media"
# else:
data[content["raw_string"]].messages[content["message_row_id"]].data = "The media is missing"
data[content["raw_string"]].messages[content["message_row_id"]].mime = "media"
data[content["raw_string"]].messages[content["message_row_id"]].meta = True
i += 1
if i % 100 == 0:
print(f"Gathering media...({i}/{total_row_number})", end="\r")
content = c.fetchone()
print(
f"Gathering media...({total_row_number}/{total_row_number})", end="\r")
def vcard(db, data):
c = db.cursor()
c.execute("""SELECT message_row_id,
jid.raw_string,
vcard,
message.text_data
FROM message_vcard
INNER JOIN message
ON message_vcard.message_row_id = message._id
LEFT JOIN chat
ON chat._id = message.chat_row_id
INNER JOIN jid
ON jid._id = chat.jid_row_id
ORDER BY message.chat_row_id ASC;""")
rows = c.fetchall()
total_row_number = len(rows)
print(f"\nGathering vCards...(0/{total_row_number})", end="\r")
base = "WhatsApp/vCards"
if not os.path.isdir(base):
Path(base).mkdir(parents=True, exist_ok=True)
for index, row in enumerate(rows):
media_name = row["text_data"] if row["text_data"] else ""
file_name = "".join(x for x in media_name if x.isalnum())
file_path = f"{base}/{file_name}.vcf"
if not os.path.isfile(file_path):
with open(file_path, "w", encoding="utf-8") as f:
f.write(row["vcard"])
data[row["raw_string"]].messages[row["message_row_id"]].data = media_name + \
"The vCard file cannot be displayed here, " \
f"however it should be located at {file_path}"
data[row["raw_string"]].messages[row["message_row_id"]].mime = "text/x-vcard"
data[row["raw_string"]].messages[row["message_row_id"]].meta = True
print(f"Gathering vCards...({index + 1}/{total_row_number})", end="\r")
def create_html(
data,
output_folder,
template=None,
embedded=False,
offline_static=False,
maximum_size=None
):
if template is None:
template_dir = os.path.dirname(__file__)
template_file = "whatsapp.html"
else:
template_dir = os.path.dirname(template)
template_file = os.path.basename(template)
templateLoader = jinja2.FileSystemLoader(searchpath=template_dir)
templateEnv = jinja2.Environment(loader=templateLoader)
templateEnv.globals.update(determine_day=determine_day)
templateEnv.filters['sanitize_except'] = sanitize_except
template = templateEnv.get_template(template_file)
total_row_number = len(data)
print(f"\nCreating HTML...(0/{total_row_number})", end="\r")
if not os.path.isdir(output_folder):
os.mkdir(output_folder)
w3css = "https://www.w3schools.com/w3css/4/w3.css"
if offline_static:
import urllib.request
static_folder = os.path.join(output_folder, offline_static)
if not os.path.isdir(static_folder):
os.mkdir(static_folder)
w3css_path = os.path.join(static_folder, "w3.css")
if not os.path.isfile(w3css_path):
with urllib.request.urlopen(w3css) as resp:
with open(w3css_path, "wb") as f:
f.write(resp.read())
w3css = os.path.join(offline_static, "w3.css")
for current, contact in enumerate(data):
if len(data[contact].messages) == 0:
continue
phone_number = contact.split('@')[0]
if "-" in contact:
file_name = ""
else:
file_name = phone_number
if data[contact].name is not None:
if file_name != "":
file_name += "-"
file_name += data[contact].name.replace("/", "-")
name = data[contact].name
else:
name = phone_number
safe_file_name = "".join(x for x in file_name if x.isalnum() or x in "- ")
with open(f"{output_folder}/{safe_file_name}.html", "w", encoding="utf-8") as f:
f.write(
template.render(
name=name,
msgs=data[contact].messages.values(),
my_avatar=None,
their_avatar=f"WhatsApp/Avatars/{contact}.j",
w3css=w3css
)
)
if current % 10 == 0:
print(f"Creating HTML...({current}/{total_row_number})", end="\r")
print(f"Creating HTML...({total_row_number}/{total_row_number})", end="\r")
if __name__ == "__main__":
from optparse import OptionParser
parser = OptionParser()
parser.add_option(
"-w",
"--wa",
dest="wa",
default="wa.db",
help="Path to contact database")
parser.add_option(
"-m",
"--media",
dest="media",
default="WhatsApp",
help="Path to WhatsApp media folder"
)
# parser.add_option(
# "-t",
# "--template",
# dest="html",
# default="wa.db",
# help="Path to HTML template")
(options, args) = parser.parse_args()
msg_db = "msgstore.db"
output_folder = "temp"
contact_db = options.wa
media_folder = options.media
if len(args) == 1:
msg_db = args[0]
elif len(args) == 2:
msg_db = args[0]
output_folder = args[1]
data = {}
if os.path.isfile(contact_db):
with sqlite3.connect(contact_db) as db:
contacts(db, data)
if os.path.isfile(msg_db):
with sqlite3.connect(msg_db) as db:
messages(db, data)
media(db, data, media_folder)
vcard(db, data)
create_html(data, output_folder)
if not os.path.isdir(f"{output_folder}/WhatsApp"):
shutil.move(media_folder, f"{output_folder}/")
with open("result.json", "w") as f:
data = json.dumps(data)
print(f"\nWriting JSON file...({int(len(data)/1024/1024)}MB)")
f.write(data)
print("Everything is done!")
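Note on the named-column access used above (`row["jid"]`, `content["key_remote_jid"]`): the standard `sqlite3` module returns plain tuples unless the connection's `row_factory` is set to `sqlite3.Row`. The connection setup is not shown in this part of the diff, so the following is only a minimal sketch under that assumption. The in-memory table and the sample contact are hypothetical stand-ins for `wa.db`.

```python
import sqlite3

# Hypothetical in-memory stand-in for wa.db, holding only the two
# columns that contacts() reads.
db = sqlite3.connect(":memory:")
db.row_factory = sqlite3.Row  # without this, rows are plain tuples
db.execute("CREATE TABLE wa_contacts (jid TEXT, display_name TEXT)")
db.execute(
    "INSERT INTO wa_contacts VALUES (?, ?)",
    ("123@s.whatsapp.net", "Alice"),
)

c = db.cursor()
c.execute("SELECT jid, display_name FROM wa_contacts")

# Iterating the cursor fetches one row at a time, like the manual
# fetchone() loop, but cannot forget to advance to the next row.
names = {}
for row in c:
    names[row["jid"]] = row["display_name"]

db.close()
print(names)
```

Iterating over the cursor is one possible alternative to the `while row is not None` pattern; it is shown here for illustration and is not how the code above is written.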


@@ -0,0 +1,74 @@
from bleach import clean as sanitize
from markupsafe import Markup
from datetime import datetime
from enum import Enum
def sanitize_except(html):
return Markup(sanitize(html, tags=["br"]))
def determine_day(last, current):
last = datetime.fromtimestamp(last).date()
current = datetime.fromtimestamp(current).date()
if last == current:
return None
else:
return current
# Android Specific
CRYPT14_OFFSETS = (
{"iv": 67, "db": 191},
{"iv": 67, "db": 190},
{"iv": 66, "db": 99},
{"iv": 67, "db": 193}
)
class Crypt(Enum):
CRYPT15 = 15
CRYPT14 = 14
CRYPT12 = 12
def brute_force_offset(max_iv=200, max_db=200):
for iv in range(0, max_iv):
for db in range(0, max_db):
yield iv, iv + 16, db
def check_update():
import urllib.request
import json
from sys import platform
from .__init__ import __version__
package_url_json = "https://pypi.org/pypi/whatsapp-chat-exporter/json"
try:
raw = urllib.request.urlopen(package_url_json)
except Exception:
print("Failed to check for updates.")
return 1
else:
with raw:
package_info = json.load(raw)
latest_version = tuple(map(int, package_info["info"]["version"].split(".")))
current_version = tuple(map(int, __version__.split(".")))
if current_version < latest_version:
print("===============Update===============")
print("A newer version of WhatsApp Chat Exporter is available.")
print("Current version: " + __version__)
print("Latest version: " + package_info["info"]["version"])
if platform == "win32":
print("Update with: pip install --upgrade whatsapp-chat-exporter")
else:
print("Update with: pip3 install --upgrade whatsapp-chat-exporter")
print("====================================")
else:
print("You are using the latest version of WhatsApp Chat Exporter.")
return 0
# iOS Specific
APPLE_TIME = datetime.timestamp(datetime(2001, 1, 1))
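Note on `check_update()`: it converts both version strings to tuples of integers before comparing them. The sketch below illustrates why that matters, since a plain string comparison orders versions incorrectly once a component reaches two digits. The helper names `parse_version` and `is_outdated` are hypothetical and not part of the project.

```python
def parse_version(version):
    """Turn a dotted numeric version such as "0.9.1" into a tuple of ints."""
    return tuple(map(int, version.split(".")))


def is_outdated(current, latest):
    """Return True when `latest` is strictly newer than `current`."""
    return parse_version(current) < parse_version(latest)


# Tuples compare component by component, so 9 < 10 is handled correctly.
outdated = is_outdated("0.9.1", "0.10.0")

# Strings compare character by character, so "9" sorts after "1".
naive = "0.9.1" < "0.10.0"

print(outdated, naive)
```

This approach assumes purely numeric components; a version such as `1.0rc1` would make `int()` raise `ValueError`.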


@@ -1,158 +1,173 @@
<!DOCTYPE html>
<html>
<head>
<title>Whatsapp - {{ name }}</title>
<link rel="stylesheet" href="https://www.w3schools.com/w3css/4/w3.css">
<style>
@import url('https://fonts.googleapis.com/css2?family=Noto+Sans+HK:wght@300;400&display=swap');
html {
font-family: 'Noto Sans HK', sans-serif;
font-size: 12px;
scroll-behavior: smooth;
}
header {
position: fixed;
z-index: 20;
border-bottom: 2px solid #e3e6e7;
font-size: 2em;
font-weight: bolder;
background-color: white;
padding: 20px 0 20px 0;
}
footer {
border-top: 2px solid #e3e6e7;
font-size: 2em;
padding: 20px 0 20px 0;
}
article {
width:500px;
margin:100px auto;
z-index:10;
font-size: 15px;
word-wrap: break-word;
}
img, video {
max-width:100%;
}
a.anchor {
display: block;
position: relative;
top: -100px;
visibility: hidden;
}
div.reply{
font-size: 13px;
text-decoration: none;
}
</style>
</head>
<body>
<header class="w3-center w3-top">Chat history with {{ name }}</header>
<article class="w3-container">
<div class="table" style="width:100%">
{% set last = {'last': 946688461.001} %}
{% for msg in msgs -%}
<div class="w3-row" style="padding-bottom: 10px">
<a class="anchor" id="{{ msg.key_id }}"></a>
{% if determine_day(last.last, msg.timestamp) is not none %}
<div class="w3-center" style="color:#70777c;padding: 10px 0 10px 0;">{{ determine_day(last.last, msg.timestamp) }}</div>
{% if last.update({'last': msg.timestamp}) %}{% endif %}
{% endif %}
{% if msg.from_me == true %}
<div class="w3-row">
<div style="float: left; color:#70777c;">{{ msg.time }}</div>
<div style="padding-left: 10px; text-align: right; color: #3892da;">You</div>
</div>
<div class="w3-row">
<div class="w3-col m10 l10">
<div style="text-align: right;">
{% if msg.reply is not none %}
<div class="reply">
<span style="color: #70777a;">Replying to </span>
<a href="#{{msg.reply}}" style="color: #168acc;">"{{ msg.quoted_data or 'media' }}"</a>
</div>
{% endif %}
{% if msg.media == false %}
{% filter escape %}{{ msg.data or "{This message is not supported yet}" | replace('\n', '<br>') }}{% endfilter %}
{% else %}
{% if "image/" in msg.mime %}
<a href="{{ msg.data }}"><img src="{{ msg.data }}" /></a>
{% elif "audio/" in msg.mime %}
<audio controls="controls" autobuffer="autobuffer">
<source src="{{ msg.data }}" />
</audio>
{% elif "video/" in msg.mime %}
<video controls="controls" autobuffer="autobuffer">
<source src="{{ msg.data }}" />
</video>
{% elif "/" in msg.mime %}
{The file cannot be displayed here, however it should be located at {{ msg.data }}}
{% else %}
{% filter escape %}{{ msg.data }}{% endfilter %}
{% endif %}
{% if msg.caption is not none %}
<br>
{{ msg.caption }}
{% endif %}
{% endif %}
</div>
</div>
<div class="w3-col m2 l2" style="padding-left: 10px"><img src="{{ my_avatar }}" onerror="this.style.display='none'"></div>
</div>
{% else %}
<div class="w3-row">
<div style="padding-right: 10px; float: left; color: #3892da;">
{% if msg.sender is not none %}
{{ msg.sender }}
{% else %}
{{ name }}
{% endif %}
</div>
<div style="text-align: right; color:#70777c;">{{ msg.time }}</div>
</div>
<div class="w3-row">
<div class="w3-col m2 l2"><img src="{{ their_avatar }}" onerror="this.style.display='none'"></div>
<div class="w3-col m10 l10">
<div style="text-align: left;">
{% if msg.reply is not none %}
<div class="reply">
<span style="color: #70777a;">Replying to </span>
<a href="#{{msg.reply}}" style="color: #168acc;">"{{ msg.quoted_data or 'media' }}"</a>
</div>
{% endif %}
{% if msg.media == false %}
{% filter escape %}{{ msg.data or "{This message is not supported yet}" }}{% endfilter %}
{% else %}
{% if "image/" in msg.mime %}
<a href="{{ msg.data }}"><img src="{{ msg.data }}" /></a>
{% elif "audio/" in msg.mime %}
<audio controls="controls" autobuffer="autobuffer">
<source src="{{ msg.data }}" />
</audio>
{% elif "video/" in msg.mime %}
<video controls="controls" autobuffer="autobuffer">
<source src="{{ msg.data }}" />
</video>
{% elif "/" in msg.mime %}
{The file cannot be displayed here, however it should be located at {{ msg.data }}}
{% else %}
{% filter escape %}{{ msg.data }}{% endfilter %}
{% endif %}
{% if msg.caption is not none %}
<br>
{{ msg.caption }}
{% endif %}
{% endif %}
</div>
</div>
</div>
{% endif %}
</div>
{% endfor %}
</div>
</article>
<footer class="w3-center">
End of history
</footer>
</body>
<!DOCTYPE html>
<html>
<head>
<title>Whatsapp - {{ name }}</title>
<meta charset="UTF-8">
<link rel="stylesheet" href="{{w3css}}">
<style>
html, body {
font-size: 12px;
scroll-behavior: smooth;
}
header {
position: fixed;
z-index: 20;
border-bottom: 2px solid #e3e6e7;
font-size: 2em;
font-weight: bolder;
background-color: white;
padding: 20px 0 20px 0;
}
footer {
border-top: 2px solid #e3e6e7;
font-size: 2em;
padding: 20px 0 20px 0;
}
article {
width:500px;
margin:100px auto;
z-index:10;
font-size: 15px;
word-wrap: break-word;
}
img, video {
max-width:100%;
}
a.anchor {
display: block;
position: relative;
top: -100px;
visibility: hidden;
}
div.reply{
font-size: 13px;
text-decoration: none;
}
</style>
</head>
<body>
<header class="w3-center w3-top">Chat history with {{ name }}</header>
<article class="w3-container">
<div class="table" style="width:100%">
{% set last = {'last': 946688461.001} %}
{% for msg in msgs -%}
<div class="w3-row" style="padding-bottom: 10px">
<a class="anchor" id="{{ msg.key_id }}"></a>
{% if determine_day(last.last, msg.timestamp) is not none %}
<div class="w3-center" style="color:#70777c;padding: 10px 0 10px 0;">{{ determine_day(last.last, msg.timestamp) }}</div>
{% if last.update({'last': msg.timestamp}) %}{% endif %}
{% endif %}
{% if msg.from_me == true %}
<div class="w3-row">
<div style="float: left; color:#70777c;">{{ msg.time }}</div>
<div style="padding-left: 10px; text-align: right; color: #3892da;">You</div>
</div>
<div class="w3-row">
<div class="w3-col m10 l10">
<div style="text-align: right;">
{% if msg.reply is not none %}
<div class="reply">
<span style="color: #70777a;">Replying to </span>
<a href="#{{msg.reply}}" style="color: #168acc;">"{{ msg.quoted_data or 'media' }}"</a>
</div>
{% endif %}
{% if msg.meta == true or msg.media == false and msg.data is none %}
<div style="text-align: center;" class="w3-panel w3-border-blue w3-pale-blue w3-rightbar w3-leftbar">
<p>{{ msg.data or 'This message is not supported' }}</p>
</div>
{% else %}
{% if msg.media == false %}
{{ msg.data | sanitize_except() }}
{% else %}
{% if "image/" in msg.mime %}
<a href="{{ msg.data }}"><img src="{{ msg.data }}" /></a>
{% elif "audio/" in msg.mime %}
<audio controls="controls" autobuffer="autobuffer">
<source src="{{ msg.data }}" />
</audio>
{% elif "video/" in msg.mime %}
<video controls="controls" autobuffer="autobuffer">
<source src="{{ msg.data }}" />
</video>
{% elif "/" in msg.mime %}
<div style="text-align: center;" class="w3-panel w3-border-blue w3-pale-blue w3-rightbar w3-leftbar">
<p>The file cannot be displayed here, however it should be located at <a href="./{{ msg.data }}">here</a></p>
</div>
{% else %}
{% filter escape %}{{ msg.data }}{% endfilter %}
{% endif %}
{% if msg.caption is not none %}
<br>
{{ msg.caption }}
{% endif %}
{% endif %}
{% endif %}
</div>
</div>
<div class="w3-col m2 l2" style="padding-left: 10px"><img src="{{ my_avatar }}" onerror="this.style.display='none'"></div>
</div>
{% else %}
<div class="w3-row">
<div style="padding-right: 10px; float: left; color: #3892da;">
{% if msg.sender is not none %}
{{ msg.sender }}
{% else %}
{{ name }}
{% endif %}
</div>
<div style="text-align: right; color:#70777c;">{{ msg.time }}</div>
</div>
<div class="w3-row">
<div class="w3-col m2 l2"><img src="{{ their_avatar }}" onerror="this.style.display='none'"></div>
<div class="w3-col m10 l10">
<div style="text-align: left;">
{% if msg.reply is not none %}
<div class="reply">
<span style="color: #70777a;">Replying to </span>
<a href="#{{msg.reply}}" style="color: #168acc;">"{{ msg.quoted_data or 'media' }}"</a>
</div>
{% endif %}
{% if msg.meta == true or msg.media == false and msg.data is none %}
<div style="text-align: center;" class="w3-panel w3-border-blue w3-pale-blue w3-rightbar w3-leftbar">
<p>{{ msg.data or 'This message is not supported' }}</p>
</div>
{% else %}
{% if msg.media == false %}
{{ msg.data | sanitize_except() }}
{% else %}
{% if "image/" in msg.mime %}
<a href="{{ msg.data }}"><img src="{{ msg.data }}" /></a>
{% elif "audio/" in msg.mime %}
<audio controls preload="auto">
<source src="{{ msg.data }}" />
</audio>
{% elif "video/" in msg.mime %}
<video controls preload="auto">
<source src="{{ msg.data }}" />
</video>
{% elif "/" in msg.mime %}
<div style="text-align: center;" class="w3-panel w3-border-blue w3-pale-blue w3-rightbar w3-leftbar">
<p>The file cannot be displayed here, however it should be located at <a href="./{{ msg.data }}">here</a></p>
</div>
{% else %}
{% filter escape %}{{ msg.data }}{% endfilter %}
{% endif %}
{% if msg.caption is not none %}
<br>
{{ msg.caption }}
{% endif %}
{% endif %}
{% endif %}
</div>
</div>
</div>
{% endif %}
</div>
{% endfor %}
</div>
</article>
<footer class="w3-center">
End of history
</footer>
</body>
</html>
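Note on the day separators in the template: the loop keeps the previous timestamp in a one-entry dict (`last`) and calls `determine_day` for each message, emitting a date line only when the calendar day changes. The same logic can be sketched in plain Python as follows. The three message timestamps are hypothetical; the sentinel value is the one the template starts with.

```python
from datetime import datetime


def determine_day(last, current):
    """Return the date of `current` if it falls on a different day than `last`."""
    last = datetime.fromtimestamp(last).date()
    current = datetime.fromtimestamp(current).date()
    return None if last == current else current


# Hypothetical message timestamps (local time): two on one day, one on the next.
timestamps = [
    datetime(2023, 5, 16, 9, 0).timestamp(),
    datetime(2023, 5, 16, 18, 30).timestamp(),
    datetime(2023, 5, 17, 8, 15).timestamp(),
]

last = 946688461.001  # same sentinel as the template's initial `last`
separators = []
for ts in timestamps:
    day = determine_day(last, ts)
    if day is not None:
        separators.append(day)  # the template renders this as a centered date line
        last = ts               # mirrors last.update({'last': msg.timestamp})

print([d.isoformat() for d in separators])
```

Because `datetime.fromtimestamp` uses the local time zone, the day boundaries follow the time zone of the machine running the export.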


@@ -1,356 +0,0 @@
#!/usr/bin/python3
import sqlite3
import json
import jinja2
import os
import requests
import shutil
import re
from datetime import datetime
from mimetypes import MimeTypes
def determine_day(last, current):
last = datetime.fromtimestamp(last).date()
current = datetime.fromtimestamp(current).date()
if last == current:
return None
else:
return current
def contacts(db, data):
# Get contacts
c = db.cursor()
c.execute("""SELECT count() FROM wa_contacts""")
total_row_number = c.fetchone()[0]
print(f"Gathering contacts...({total_row_number})")
c.execute("""SELECT jid, display_name FROM wa_contacts; """)
row = c.fetchone()
while row is not None:
data[row[0]] = {"name": row[1], "messages": {}}
row = c.fetchone()
def messages(db, data):
# Get message history
c = db.cursor()
c.execute("""SELECT count() FROM messages""")
total_row_number = c.fetchone()[0]
print(f"Gathering messages...(0/{total_row_number})", end="\r")
phone_number_re = re.compile(r"[0-9]+@s.whatsapp.net")
c.execute("""SELECT messages.key_remote_jid,
messages._id,
messages.key_from_me,
messages.timestamp,
messages.data,
messages.status,
messages.edit_version,
messages.thumb_image,
messages.remote_resource,
messages.media_wa_type,
messages.latitude,
messages.longitude,
messages_quotes.key_id as quoted,
messages.key_id,
messages_quotes.data,
messages.media_caption
FROM messages
LEFT JOIN messages_quotes
ON messages.quoted_row_id = messages_quotes._id;""")
i = 0
content = c.fetchone()
while content is not None:
if content[0] not in data:
data[content[0]] = {"name": None, "messages": {}}
data[content[0]]["messages"][content[1]] = {
"from_me": bool(content[2]),
"timestamp": content[3]/1000,
"time": datetime.fromtimestamp(content[3]/1000).strftime("%H:%M"),
"media": False,
"key_id": content[13]
}
if "-" in content[0] and content[2] == 0:
name = None
if content[8] in data:
name = data[content[8]]["name"]
if "@" in content[8]:
fallback = content[8].split('@')[0]
else:
fallback = None
else:
fallback = None
data[content[0]]["messages"][content[1]]["sender"] = name or fallback
else:
data[content[0]]["messages"][content[1]]["sender"] = None
if content[12] is not None:
data[content[0]]["messages"][content[1]]["reply"] = content[12]
data[content[0]]["messages"][content[1]]["quoted_data"] = content[14]
else:
data[content[0]]["messages"][content[1]]["reply"] = None
if content[15] is not None:
data[content[0]]["messages"][content[1]]["caption"] = content[15]
else:
data[content[0]]["messages"][content[1]]["caption"] = None
if content[5] == 6:
if "-" in content[0]:
# Is Group
if content[4] is not None:
try:
int(content[4])
except ValueError:
msg = "{The group name changed to "f"{content[4]}"" }"
data[content[0]]["messages"][content[1]]["data"] = msg
else:
del data[content[0]]["messages"][content[1]]
else:
thumb_image = content[7]
if thumb_image is not None:
if b"\x00\x00\x01\x74\x00\x1A" in thumb_image:
# Add user
added = phone_number_re.search(
thumb_image.decode("unicode_escape"))[0]
if added in data:
name_right = data[added]["name"]
else:
name_right = added.split('@')[0]
if content[8] is not None:
if content[8] in data:
name_left = data[content[8]]["name"]
else:
name_left = content[8].split('@')[0]
msg = "{"f"{name_left}"f" added {name_right or 'You'}""}"
else:
msg = "{"f"Added {name_right or 'You'}""}"
elif b"\xac\xed\x00\x05\x74\x00" in thumb_image:
# Changed number
original = content[8].split('@')[0]
changed = thumb_image[7:].decode().split('@')[0]
msg = "{"f"{original} changed to {changed}""}"
data[content[0]]["messages"][content[1]]["data"] = msg
else:
if content[4] is None:
del data[content[0]]["messages"][content[1]]
else:
# Private chat
if content[4] is None and content[7] is None:
del data[content[0]]["messages"][content[1]]
else:
if content[2] == 1:
if content[5] == 5 and content[6] == 7:
msg = "{Message deleted}"
else:
if content[9] == "5":
msg = "{ Location shared: "f"{content[10], content[11]}"" }"
else:
msg = content[4]
else:
if content[5] == 0 and content[6] == 7:
msg = "{Message deleted}"
else:
if content[9] == "5":
msg = "{ Location shared: "f"{content[10], content[11]}"" }"
else:
msg = content[4]
data[content[0]]["messages"][content[1]]["data"] = msg
i += 1
if i % 1000 == 0:
print(f"Gathering messages...({i}/{total_row_number})", end="\r")
content = c.fetchone()
print(
f"Gathering messages...({total_row_number}/{total_row_number})", end="\r")
def media(db, data, media_folder):
# Get media
c = db.cursor()
c.execute("""SELECT count() FROM message_media""")
total_row_number = c.fetchone()[0]
print(f"\nGathering media...(0/{total_row_number})", end="\r")
i = 0
c.execute("""SELECT messages.key_remote_jid,
message_row_id,
file_path,
message_url,
mime_type,
media_key
FROM message_media
INNER JOIN messages
ON message_media.message_row_id = messages._id
ORDER BY messages.key_remote_jid ASC""")
content = c.fetchone()
mime = MimeTypes()
while content is not None:
file_path = f"{media_folder}/{content[2]}"
data[content[0]]["messages"][content[1]]["media"] = True
if os.path.isfile(file_path):
data[content[0]]["messages"][content[1]]["data"] = file_path
if content[4] is None:
guess = mime.guess_type(file_path)[0]
if guess is not None:
data[content[0]]["messages"][content[1]]["mime"] = guess
else:
data[content[0]]["messages"][content[1]]["mime"] = "data/data"
else:
data[content[0]]["messages"][content[1]]["mime"] = content[4]
else:
# if "https://mmg" in content[4]:
# try:
# r = requests.get(content[3])
# if r.status_code != 200:
# raise RuntimeError()
# except:
# data[content[0]]["messages"][content[1]]["data"] = "{The media is missing}"
# data[content[0]]["messages"][content[1]]["media"] = True
# data[content[0]]["messages"][content[1]]["mime"] = "media"
# else:
data[content[0]]["messages"][content[1]]["data"] = "{The media is missing}"
data[content[0]]["messages"][content[1]]["mime"] = "media"
i += 1
if i % 100 == 0:
print(f"Gathering media...({i}/{total_row_number})", end="\r")
content = c.fetchone()
print(
f"Gathering media...({total_row_number}/{total_row_number})", end="\r")
def vcard(db, data):
c = db.cursor()
c.execute("""SELECT message_row_id,
messages.key_remote_jid,
vcard,
messages.media_name
FROM messages_vcards
INNER JOIN messages
ON messages_vcards.message_row_id = messages._id
ORDER BY messages.key_remote_jid ASC;""")
rows = c.fetchall()
total_row_number = len(rows)
print(f"\nGathering vCards...(0/{total_row_number})", end="\r")
base = "WhatsApp/vCards"
for index, row in enumerate(rows):
if not os.path.isdir(base):
os.mkdir(base)
file_name = "".join(x for x in row[3] if x.isalnum())
file_path = f"{base}/{file_name}.vcf"
if not os.path.isfile(file_path):
with open(file_path, "w", encoding="utf-8") as f:
f.write(row[2])
data[row[1]]["messages"][row[0]]["data"] = row[3] + \
"{ The vCard file cannot be displayed here, however it " \
"should be located at " + file_path + "}"
data[row[1]]["messages"][row[0]]["mime"] = "text/x-vcard"
print(f"Gathering vCards...({index + 1}/{total_row_number})", end="\r")
def create_html(data, output_folder):
templateLoader = jinja2.FileSystemLoader(searchpath="./")
templateEnv = jinja2.Environment(loader=templateLoader)
templateEnv.globals.update(determine_day=determine_day)
TEMPLATE_FILE = "whatsapp.html"
template = templateEnv.get_template(TEMPLATE_FILE)
total_row_number = len(data)
print(f"\nCreating HTML...(0/{total_row_number})", end="\r")
if not os.path.isdir(output_folder):
os.mkdir(output_folder)
for current, contact in enumerate(data):
if len(data[contact]["messages"]) == 0:
continue
phone_number = contact.split('@')[0]
if "-" in contact:
file_name = ""
else:
file_name = phone_number
if data[contact]["name"] is not None:
if file_name != "":
file_name += "-"
file_name += data[contact]["name"].replace("/", "-")
name = data[contact]["name"]
else:
name = phone_number
safe_file_name = ''
safe_file_name = "".join(x for x in file_name if x.isalnum() or x in "- ")
with open(f"{output_folder}/{safe_file_name}.html", "w", encoding="utf-8") as f:
f.write(
template.render(
name=name,
msgs=data[contact]["messages"].values(),
my_avatar=None,
their_avatar=f"WhatsApp/Avatars/{contact}.j"
)
)
if current % 10 == 0:
print(f"Creating HTML...({current}/{total_row_number})", end="\r")
print(f"Creating HTML...({total_row_number}/{total_row_number})", end="\r")


if __name__ == "__main__":
    from optparse import OptionParser
    parser = OptionParser()
    parser.add_option(
        "-w",
        "--wa",
        dest="wa",
        default="wa.db",
        help="Path to contact database")
    parser.add_option(
        "-m",
        "--media",
        dest="media",
        default="WhatsApp",
        help="Path to WhatsApp media folder"
    )
    # parser.add_option(
    #     "-t",
    #     "--template",
    #     dest="html",
    #     default="wa.db",
    #     help="Path to HTML template")
    (options, args) = parser.parse_args()
    msg_db = "msgstore.db"
    output_folder = "temp"
    contact_db = options.wa
    media_folder = options.media

    if len(args) == 1:
        msg_db = args[0]
    elif len(args) == 2:
        msg_db = args[0]
        output_folder = args[1]

    data = {}

    if os.path.isfile(contact_db):
        with sqlite3.connect(contact_db) as db:
            contacts(db, data)
    if os.path.isfile(msg_db):
        with sqlite3.connect(msg_db) as db:
            messages(db, data)
            media(db, data, media_folder)
            vcard(db, data)
        create_html(data, output_folder)

    if not os.path.isdir(f"{output_folder}/WhatsApp"):
        shutil.move(media_folder, f"{output_folder}/")

    with open("result.json", "w") as f:
        data = json.dumps(data)
        print(f"\nWriting JSON file...({int(len(data)/1024/1024)}MB)")
        f.write(data)

    print("Everything is done!")

View File

@@ -1,50 +0,0 @@
#!/usr/bin/python3

import shutil
import sqlite3
import os


def extract_media(base_dir):
    with sqlite3.connect(f"{base_dir}/Manifest.db") as manifest:
        c = manifest.cursor()
        c.execute("""SELECT count()
                     FROM Files
                     WHERE relativePath
                         LIKE 'Message/Media/%'""")
        total_row_number = c.fetchone()[0]
        print(f"Gathering media...(0/{total_row_number})", end="\r")
        c.execute("""SELECT fileID,
                            relativePath,
                            flags
                     FROM Files
                     WHERE relativePath
                         LIKE 'Message/Media/%'""")
        row = c.fetchone()
        if not os.path.isdir("Message"):
            os.mkdir("Message")
        if not os.path.isdir("Message/Media"):
            os.mkdir("Message/Media")
        i = 0
        while row is not None:
            destination = row[1]
            hashes = row[0]
            folder = hashes[:2]
            flags = row[2]
            if flags == 2:
                os.mkdir(destination)
            elif flags == 1:
                shutil.copyfile(f"{base_dir}/{folder}/{hashes}", destination)
            i += 1
            if i % 100 == 0:
                print(f"Gathering media...({i}/{total_row_number})", end="\r")
            row = c.fetchone()
        print(f"Gathering media...({total_row_number}/{total_row_number})", end="\r")


if __name__ == "__main__":
    from optparse import OptionParser
    parser = OptionParser()
    (_, args) = parser.parse_args()
    base_dir = args[0]
    extract_media(base_dir)

BIN group.png: binary file not shown (before: 29 KiB)

BIN imgs/android_structure.png (normal file): binary file not shown (after: 12 KiB)

Binary file not shown (after: 12 KiB)

Binary file not shown (after: 12 KiB)

BIN imgs/group.png (normal file): binary file not shown (after: 36 KiB)

BIN imgs/pm.png (normal file): binary file not shown (after: 38 KiB)

View File (before: 7.8 KiB, after: 7.8 KiB)

BIN pm.png: binary file not shown (before: 32 KiB)

63
setup.py Normal file
View File

@@ -0,0 +1,63 @@
import setuptools
from re import search

with open("README.md", "r") as fh:
    long_description = fh.read()

with open("Whatsapp_Chat_Exporter/__init__.py", encoding="utf8") as f:
    version = search(r'__version__ = "(.*?)"', f.read()).group(1)

setuptools.setup(
    name="whatsapp-chat-exporter",
    version=version,
    author="KnugiHK",
    author_email="hello@knugi.com",
    description="A Whatsapp database parser that will give you the "
                "history of your Whatsapp conversations in HTML and JSON.",
    long_description=long_description,
    long_description_content_type="text/markdown",
    license="MIT",
    # Each keyword needs a trailing comma: without one, adjacent string
    # literals are silently concatenated ("message" "customizable").
    keywords=[
        "android", "ios", "parsing", "history", "iphone", "whatsapp",
        "message", "customizable", "android-backup", "crypt12",
        "whatsapp-chat-exporter", "whatsapp-export", "whatsapp-database",
        "whatsapp-database-parser", "whatsapp-conversations",
        "iphone-backup", "crypt14", "crypt15", "messages"
    ],
    platforms=["any"],
    url="https://github.com/KnugiHK/Whatsapp-Chat-Exporter",
    packages=setuptools.find_packages(),
    package_data={
        '': ['whatsapp.html']
    },
    classifiers=[
        "Programming Language :: Python :: 3 :: Only",
        "Programming Language :: Python :: 3.7",
        "Programming Language :: Python :: 3.8",
        "Programming Language :: Python :: 3.9",
        "Programming Language :: Python :: 3.10",
        "License :: OSI Approved :: MIT License",
        "Operating System :: OS Independent",
        "Development Status :: 4 - Beta",
        "Environment :: Console",
        "Intended Audience :: End Users/Desktop",
        "Topic :: Communications :: Chat",
        "Topic :: Utilities",
        "Topic :: Database"
    ],
    python_requires='>=3.7',
    install_requires=[
        'jinja2',
        'bleach'
    ],
    extras_require={
        'android_backup': ["pycryptodome", "javaobj-py3"],
        'crypt12': ["pycryptodome"],
        'crypt14': ["pycryptodome"],
        'crypt15': ["pycryptodome", "javaobj-py3"]
    },
    entry_points={
        "console_scripts": [
            "wtsexporter = Whatsapp_Chat_Exporter.__main__:main"
        ]
    }
)