51 Commits

Author SHA1 Message Date
KnugiHK
2ca064d111 Revert back to 3.10 as 3.11 and 3.12 failed 2024-06-08 18:24:05 +08:00
KnugiHK
3b54ca9d28 Change to use OIDC for publishing to PyPi 2024-06-08 18:20:44 +08:00
KnugiHK
03312da6ee Update Python to 12 for Nuitka 2024-06-08 17:57:48 +08:00
KnugiHK
c7e8a603c7 Undefined vCard File 2024-06-08 17:51:44 +08:00
KnugiHK
574b0393d8 Align vCard UX with iOS 2024-06-08 17:50:48 +08:00
KnugiHK
baa79a7b74 Merge branch 'pr/99' into dev 2024-06-08 17:38:51 +08:00
KnugiHK
d57ff29e71 Add link to vcard entry 2024-06-08 17:29:10 +08:00
jonx
2d4d934a91 Handle groups of VCards correctly 2024-05-03 02:36:17 +02:00
Knugi
9741cab078 Merge pull request #93 from mmmeeedddsss/main
Add support for separating media files per chat
2024-04-26 21:38:37 +08:00
KnugiHK
1e7687f8e8 Update the help text for --create-separated-media 2024-04-21 12:35:09 +08:00
KnugiHK
524b3a4034 Implement separate media for iOS also 2024-04-21 12:33:03 +08:00
KnugiHK
1ab4b24fa0 Fix typo 2024-04-21 12:03:41 +08:00
KnugiHK
8d003b217c Refactor a bit and use chat jid as the final fallback 2024-04-21 12:00:25 +08:00
KnugiHK
d754e6c279 Add Django license 2024-04-21 11:37:05 +08:00
Mert Tunc
0eebbcff21 Add support for separating media files per chat 2024-04-15 19:20:33 +03:00
KnugiHK
a569fb0875 Change exit to argparse error 2024-02-24 16:41:33 +08:00
KnugiHK
6e8e0d7f59 Implement per chat json #86 2024-02-24 16:26:15 +08:00
KnugiHK
c0a511adb3 Refactor imports 2024-02-13 16:01:21 +08:00
KnugiHK
e84640de1c Update description 2024-02-13 15:59:14 +08:00
KnugiHK
20199ed794 Rename files and names 2024-02-13 15:58:29 +08:00
KnugiHK
f4e610a953 Rename variable of the date filter in processes 2024-02-13 15:37:06 +08:00
KnugiHK
99a3a4bcd0 Refactor chat condition 2024-02-13 15:33:58 +08:00
KnugiHK
dedfce8feb Improve help menu 2024-02-13 15:11:53 +08:00
KnugiHK
54e0b43888 Improve redirection 2024-02-13 15:03:51 +08:00
KnugiHK
d5ea843286 Add chat filter 2024-02-13 14:52:15 +08:00
KnugiHK
b01fe0ab4a Bug fix on missing table join 2024-02-13 13:57:22 +08:00
KnugiHK
a7ccc3be66 Merge branch 'main' into dev 2024-02-12 18:08:59 +08:00
KnugiHK
07b1cf6a8a Wrong place 2024-02-12 18:08:45 +08:00
KnugiHK
2b49ac2e41 Add date filter for iOS #82 2024-02-12 18:00:26 +08:00
KnugiHK
2466e2542a Add date filter for Android #82 2024-02-12 17:31:29 +08:00
Knugi
af70f6f6f9 Update CNAME 2024-02-12 09:27:09 +00:00
KnugiHK
48c3fa965f Bug fix on wrong table name for old schema 2023-12-29 18:32:38 +08:00
KnugiHK
472c18448c Update the Python version for standalone binaries to 3.11 2023-12-17 14:53:18 +08:00
KnugiHK
810d8c7c8b Revert "Add macos-13-arm64"
This reverts commit f80be81ee6.
2023-12-17 14:53:09 +08:00
KnugiHK
f80be81ee6 Add macos-13-arm64 2023-12-17 14:43:20 +08:00
KnugiHK
0fcaa946e6 Add --no-deployment-flag=self-execution flag for -m option in binary 2023-12-17 14:40:47 +08:00
KnugiHK
1e7953e5fe Update .gitignore 2023-12-17 14:40:27 +08:00
KnugiHK
481656fdeb Bug fix for the different between key file and hex key 2023-12-17 14:40:08 +08:00
KnugiHK
3d155fb48f Bug fix on wrong timestamp variable 2023-12-17 14:37:59 +08:00
KnugiHK
f659a8c171 Make sure all messages are extracted chronologically #64 2023-12-07 23:17:50 +08:00
KnugiHK
3ffb63ed28 Update readme 2023-12-03 20:41:34 +08:00
KnugiHK
94956913e8 Add time zone offset to display time 2023-12-03 20:35:58 +08:00
KnugiHK
7b5a7419f1 Apply ZCONTACTJID as chat ID in vcard process 2023-12-03 20:21:19 +08:00
KnugiHK
d5cef051d3 Fix an incorrect variable 2023-12-03 20:17:28 +08:00
KnugiHK
f81f31d667 Fix attempt to #64 2023-12-03 14:19:49 +08:00
KnugiHK
8c617b721f Create bruteforce_crypt15.py 2023-12-03 13:49:57 +08:00
KnugiHK
0d626519ec Add support for decrypting contact db from crypt15 backup 2023-12-03 13:40:21 +08:00
Knugi
f39d448aa6 Update issue templates 2023-12-03 03:19:00 +00:00
Knugi
2dc433df7c Update issue templates 2023-12-03 03:17:41 +00:00
Knugi
75a8a2e8c5 Update issue templates 2023-12-03 03:16:52 +00:00
KnugiHK
3847836ed6 Fix: use a more reliable way to determine chat #61
This fix also changed how to determine if the message belongs to a group or PM.
2023-12-03 10:27:22 +08:00
18 changed files with 655 additions and 149 deletions

.github/ISSUE_TEMPLATE/bug_report.md (new file, 35 changes)

@@ -0,0 +1,35 @@
---
name: Bug report
about: Create a report to help us improve
title: "[BUG]"
labels: ''
assignees: ''
---
# Must have
- WhatsApp version: [WhatsApp version] - [Android/iOS]
- Platform: [Linux/Windows/MacOS]
- Branch and version: [main/dev] - [exporter version]
If the error is raised by Python, please also provide the traceback
```
[traceback here]
```
# Nice to have
**Describe the bug**
A clear and concise description of what the bug is.
**To Reproduce**
Steps to reproduce the behavior:
1. Go to '...'
2. Click on '....'
3. Scroll down to '....'
4. See error
**Screenshots**
If applicable, add screenshots to help explain your problem.
**Additional context**
Add any other context about the problem here.


@@ -0,0 +1,17 @@
---
name: Feature request
about: Suggest an idea for this project
title: "[FEATURE]"
labels: ''
assignees: ''
---
**Is your feature request related to a problem? Please describe.**
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
**Describe the solution you'd like**
A clear and concise description of what you want to happen.
**Additional context**
Add any other context or screenshots about the feature request here.


@@ -16,7 +16,7 @@ jobs:
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: '3.10'
python-version: '3.11'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
@@ -24,7 +24,7 @@ jobs:
pip install .
- name: Build binary with Nuitka
run: |
python -m nuitka --onefile --include-data-file=./Whatsapp_Chat_Exporter/whatsapp.html=./Whatsapp_Chat_Exporter/whatsapp.html --follow-imports Whatsapp_Chat_Exporter/__main__.py
python -m nuitka --no-deployment-flag=self-execution --onefile --include-data-file=./Whatsapp_Chat_Exporter/whatsapp.html=./Whatsapp_Chat_Exporter/whatsapp.html --follow-imports Whatsapp_Chat_Exporter/__main__.py
cp __main__.bin wtsexporter_linux_x64
sha256sum wtsexporter_linux_x64
- uses: actions/upload-artifact@v3
@@ -40,7 +40,7 @@ jobs:
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: '3.10'
python-version: '3.11'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
@@ -48,7 +48,7 @@ jobs:
pip install .
- name: Build binary with Nuitka
run: |
python -m nuitka --onefile --include-data-file=./Whatsapp_Chat_Exporter/whatsapp.html=./Whatsapp_Chat_Exporter/whatsapp.html --assume-yes-for-downloads --follow-imports Whatsapp_Chat_Exporter\__main__.py
python -m nuitka --no-deployment-flag=self-execution --onefile --include-data-file=./Whatsapp_Chat_Exporter/whatsapp.html=./Whatsapp_Chat_Exporter/whatsapp.html --assume-yes-for-downloads --follow-imports Whatsapp_Chat_Exporter\__main__.py
copy __main__.exe wtsexporter_x64.exe
Get-FileHash wtsexporter_x64.exe
- uses: actions/upload-artifact@v3
@@ -72,7 +72,7 @@ jobs:
pip install .
- name: Build binary with Nuitka
run: |
python -m nuitka --onefile --include-data-file=./Whatsapp_Chat_Exporter/whatsapp.html=./Whatsapp_Chat_Exporter/whatsapp.html --follow-imports Whatsapp_Chat_Exporter/__main__.py
python -m nuitka --no-deployment-flag=self-execution --onefile --include-data-file=./Whatsapp_Chat_Exporter/whatsapp.html=./Whatsapp_Chat_Exporter/whatsapp.html --follow-imports Whatsapp_Chat_Exporter/__main__.py
cp __main__.bin wtsexporter_macos_x64
shasum -a 256 wtsexporter_macos_x64
- uses: actions/upload-artifact@v3


@@ -18,9 +18,9 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: Set up Python
uses: actions/setup-python@v4
uses: actions/setup-python@v5
with:
python-version: '3.x'
- name: Install dependencies
@@ -31,6 +31,3 @@ jobs:
run: python -m build
- name: Publish package
uses: pypa/gh-action-pypi-publish@release/v1
with:
user: __token__
password: ${{ secrets.PYPI_API_TOKEN }}

.gitignore (7 changes)

@@ -127,3 +127,10 @@ dmypy.json
# Pyre type checker
.pyre/
# Nuitka
*.build/
*.dist/
*.onefile-build/
*.exe
__main__

CNAME (2 changes)

@@ -1 +1 @@
wts.knugi.com
wts.knugi.dev

LICENSE.django (new file, 36 changes)

@@ -0,0 +1,36 @@
The Whatsapp Chat Exporter is licensed under the MIT license. For more information,
refer to the file LICENSE.
Whatsapp Chat Exporter incorporates code from Django, governed by the three-clause
BSD license—a permissive open-source license. The copyright and license details are
provided below to adhere to Django's terms.
------
Copyright (c) Django Software Foundation and individual contributors.
All rights reserved.
Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
3. Neither the name of Django nor the names of its contributors may be used
to endorse or promote products derived from this software without
specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


@@ -114,13 +114,19 @@ After extracting, you will get these:
Invoking wtsexporter with the --help option shows all available options.
```sh
> wtsexporter --help
usage: wtsexporter [-h] [-a] [-i] [-e EXPORTED] [-w WA] [-m MEDIA] [-b BACKUP] [-o OUTPUT] [-j [JSON]] [-d DB] [-k KEY] [-t TEMPLATE] [-s] [-c] [--offline OFFLINE] [--size [SIZE]]
[--no-html] [--check-update] [--assume-first-as-me]
usage: wtsexporter [-h] [-a] [-i] [-e EXPORTED] [-w WA] [-m MEDIA] [-b BACKUP] [-o OUTPUT] [-j [JSON]] [-d DB]
[-k KEY] [-t TEMPLATE] [-s] [-c] [--offline OFFLINE] [--size [SIZE]] [--no-html] [--check-update]
[--assume-first-as-me] [--no-avatar] [--import] [--business] [--preserve-timestamp] [--wab WAB]
[--time-offset {-12 to 14}] [--date DATE] [--date-format FORMAT] [--include [phone number ...]]
[--exclude [phone number ...]] [--create-separated-media]
A customizable Android and iPhone WhatsApp database parser that will give you the history of your WhatsApp
conversations in HTML and JSON. Android Backup Crypt12, Crypt14 and Crypt15 supported.
options:
-h, --help show this help message and exit
-a, --android Define the target as Android
-i, --iphone, --ios Define the target as iPhone/iPad
-i, --ios, --iphone Define the target as iPhone/iPad
-e EXPORTED, --exported EXPORTED
Define the target as exported chat file and specify the path to the file
-w WA, --wa WA Path to contact database (default: wa.db/ContactsV2.sqlite)
@@ -144,8 +150,25 @@ options:
--no-html Do not output html files
--check-update Check for updates (require Internet access)
--assume-first-as-me Assume the first message in a chat as sent by me (must be used together with -e)
--no-avatar Do not render avatar in HTML output
--import Import JSON file and convert to HTML output
--business Use Whatsapp Business default files (iOS only)
--preserve-timestamp Preserve the modification timestamp of the extracted files (iOS only)
--wab WAB, --wa-backup WAB
Path to contact database in crypt15 format
--time-offset {-12 to 14}
Offset in hours (-12 to 14) for time displayed in the output
--date DATE The date filter in specific format (inclusive)
--date-format FORMAT The date format for the date filter
--include [phone number ...]
Include chats that match the supplied phone number
--exclude [phone number ...]
Exclude chats that match the supplied phone number
--create-separated-media
Create a copy of the media separated per chat in <MEDIA>/separated/ directory
(Android only)
WhatsApp Chat Exporter: 0.9.7 Licensed with MIT
WhatsApp Chat Exporter: 0.10.0 Licensed with MIT
```
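The `--time-offset` option listed above shifts the times shown in the output by a whole number of hours. A minimal sketch of that idea follows. It assumes timestamps are Unix seconds, and `format_with_offset` is a hypothetical helper name; the exporter's own implementation is not shown in this diff.

```python
from datetime import datetime, timedelta, timezone


def format_with_offset(timestamp: int, offset_hours: int,
                       fmt: str = "%Y-%m-%d %H:%M") -> str:
    """Render a Unix timestamp (seconds) in a fixed UTC offset.

    Hypothetical helper: it only illustrates what an hour offset in the
    range -12..14 does to the displayed time.
    """
    tz = timezone(timedelta(hours=offset_hours))
    return datetime.fromtimestamp(timestamp, tz).strftime(fmt)


if __name__ == "__main__":
    # The same instant rendered in UTC+8 and in UTC-12.
    print(format_with_offset(0, 8))
    print(format_with_offset(0, -12))
```

A fixed offset ignores daylight saving time, which is presumably acceptable for a display-only setting.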
# To do


@@ -1,3 +1,3 @@
#!/usr/bin/python3
__version__ = "0.9.7"
__version__ = "0.10.0"


@@ -1,16 +1,19 @@
#!/usr/bin/python3
import io
import os
import sqlite3
import shutil
import json
import string
import glob
from Whatsapp_Chat_Exporter import extract_exported, extract_iphone
from Whatsapp_Chat_Exporter import extract, extract_iphone_media
from Whatsapp_Chat_Exporter import exported_handler, android_handler
from Whatsapp_Chat_Exporter import ios_handler, ios_media_handler
from Whatsapp_Chat_Exporter.data_model import ChatStore
from Whatsapp_Chat_Exporter.utility import Crypt, check_update, import_from_json
from Whatsapp_Chat_Exporter.utility import APPLE_TIME, Crypt, DbType
from Whatsapp_Chat_Exporter.utility import check_update, import_from_json
from argparse import ArgumentParser, SUPPRESS
from datetime import datetime
from sys import exit
try:
from .__init__ import __version__
@@ -20,10 +23,11 @@ except ImportError:
def main():
parser = ArgumentParser(
description = 'A customizable Android and iPhone WhatsApp database parser that '
description = 'A customizable Android and iOS/iPadOS WhatsApp database parser that '
'will give you the history of your WhatsApp conversations in HTML '
'and JSON. Android Backup Crypt12, Crypt14 and Crypt15 supported.',
epilog = f'WhatsApp Chat Exporter: {__version__} Licensed with MIT'
epilog = f'WhatsApp Chat Exporter: {__version__} Licensed with MIT. See '
'https://wts.knugi.dev/docs?dest=osl for all open source licenses.'
)
parser.add_argument(
'-a',
@@ -34,9 +38,9 @@ def main():
help="Define the target as Android")
parser.add_argument(
'-i',
'--iphone',
'--ios',
dest='iphone',
'--iphone',
dest='ios',
default=False,
action='store_true',
help="Define the target as iPhone/iPad")
@@ -65,7 +69,7 @@ def main():
dest="backup",
default=None,
help="Path to Android (must be used together "
"with -k)/iPhone WhatsApp backup")
"with -k)/iOS WhatsApp backup")
parser.add_argument(
"-o",
"--output",
@@ -191,6 +195,64 @@ def main():
action='store_true',
help="Preserve the modification timestamp of the extracted files (iOS only)"
)
parser.add_argument(
"--wab",
"--wa-backup",
dest="wab",
default=None,
help="Path to contact database in crypt15 format"
)
parser.add_argument(
"--time-offset",
dest="timezone_offset",
default=0,
type=int,
choices=range(-12, 15),
metavar="{-12 to 14}",
help="Offset in hours (-12 to 14) for time displayed in the output"
)
parser.add_argument(
"--date",
dest="filter_date",
default=None,
metavar="DATE",
help="The date filter in specific format (inclusive)"
)
parser.add_argument(
"--date-format",
dest="filter_date_format",
default="%Y-%m-%d %H:%M",
metavar="FORMAT",
help="The date format for the date filter"
)
parser.add_argument(
"--include",
dest="filter_chat_include",
nargs='*',
metavar="phone number",
help="Include chats that match the supplied phone number"
)
parser.add_argument(
"--exclude",
dest="filter_chat_exclude",
nargs='*',
metavar="phone number",
help="Exclude chats that match the supplied phone number"
)
parser.add_argument(
"--per-chat",
dest="json_per_chat",
default=False,
action='store_true',
help="Output the JSON file per chat"
)
parser.add_argument(
"--create-separated-media",
dest="separate_media",
default=False,
action='store_true',
help="Create a copy of the media separated per chat in <MEDIA>/separated/ directory"
)
args = parser.parse_args()
# Check for updates
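The `--time-offset` argument added in this hunk combines `type=int` with `choices=range(-12, 15)`, so argparse itself rejects out-of-range values before any exporter code runs. A small standalone sketch of that behaviour follows; the `demo` program name and the `build_parser` function are illustrative only and are not part of the project.

```python
from argparse import ArgumentParser


def build_parser() -> ArgumentParser:
    """Build a tiny parser that mirrors two of the options added above."""
    parser = ArgumentParser(prog="demo")
    parser.add_argument(
        "--time-offset",
        dest="timezone_offset",
        default=0,
        type=int,
        choices=range(-12, 15),  # upper bound is exclusive, so 14 is allowed
        metavar="{-12 to 14}",
    )
    parser.add_argument(
        "--include",
        dest="filter_chat_include",
        nargs="*",  # zero or more phone numbers; None when the flag is absent
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--time-offset", "8", "--include", "85212345678"])
    print(args.timezone_offset, args.filter_chat_include)
```

Because `nargs='*'` yields `None` when the flag is absent and an empty list when it is given without values, code that checks `is not None` treats a bare `--include` as an active filter.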
@@ -198,37 +260,80 @@ def main():
exit(check_update())
# Sanity checks
if args.android and args.iphone and args.exported and args.import_json:
print("You must define only one device type.")
exit(1)
if not args.android and not args.iphone and not args.exported and not args.import_json:
print("You must define the device type.")
exit(1)
if sum(bool(x) for x in (args.android, args.ios, args.exported, args.import_json)) > 1:
parser.error("You must define only one device type.")
if not args.android and not args.ios and not args.exported and not args.import_json:
parser.error("You must define the device type.")
if args.no_html and not args.json:
print("You must either specify a JSON output file or enable HTML output.")
exit(1)
if args.import_json and (args.android or args.iphone or args.exported or args.no_html):
print("You can only use --import with -j and without --no-html.")
exit(1)
parser.error("You must either specify a JSON output file or enable HTML output.")
if args.import_json and (args.android or args.ios or args.exported or args.no_html):
parser.error("You can only use --import with -j and without --no-html.")
elif args.import_json and not os.path.isfile(args.json):
print("JSON file not found.")
exit(1)
parser.error("JSON file not found.")
if args.android and args.business:
print("WhatsApp Business is only available on iOS for now.")
exit(1)
parser.error("WhatsApp Business is only available on iOS for now.")
if args.json_per_chat and (
(args.json[-5:] != ".json" and os.path.isfile(args.json)) or \
(args.json[-5:] == ".json" and os.path.isfile(args.json[:-5]))
):
parser.error("When --per-chat is enabled, the destination of --json must be a directory.")
if args.filter_date is not None:
if " - " in args.filter_date:
start, end = args.filter_date.split(" - ")
start = int(datetime.strptime(start, args.filter_date_format).timestamp())
end = int(datetime.strptime(end, args.filter_date_format).timestamp())
if start < 1230768000 or end < 1230768000:  # 2009-01-01 00:00:00 UTC
parser.error("WhatsApp was first released in 2009...")
if start > end:
parser.error("The start date cannot be a moment after the end date.")
if args.android:
args.filter_date = f"BETWEEN {start}000 AND {end}000"
elif args.ios:
args.filter_date = f"BETWEEN {start - APPLE_TIME} AND {end - APPLE_TIME}"
else:
_timestamp = int(datetime.strptime(args.filter_date[2:], args.filter_date_format).timestamp())
if _timestamp < 1230768000:  # 2009-01-01 00:00:00 UTC
parser.error("WhatsApp was first released in 2009...")
if args.filter_date[:2] == "> ":
if args.android:
args.filter_date = f">= {_timestamp}000"
elif args.ios:
args.filter_date = f">= {_timestamp - APPLE_TIME}"
elif args.filter_date[:2] == "< ":
if args.android:
args.filter_date = f"<= {_timestamp}000"
elif args.ios:
args.filter_date = f"<= {_timestamp - APPLE_TIME}"
else:
parser.error("Unsupported date format. See https://wts.knugi.dev/docs?dest=date")
if args.filter_chat_include is not None and args.filter_chat_exclude is not None:
parser.error("Chat inclusion and exclusion filters cannot be used together.")
if args.filter_chat_include is not None:
for chat in args.filter_chat_include:
if not chat.isnumeric():
parser.error("Enter a phone number in the chat filter. See https://wts.knugi.dev/docs?dest=chat")
if args.filter_chat_exclude is not None:
for chat in args.filter_chat_exclude:
if not chat.isnumeric():
parser.error("Enter a phone number in the chat filter. See https://wts.knugi.dev/docs?dest=chat")
filter_chat = (args.filter_chat_include, args.filter_chat_exclude)
data = {}
if args.android:
contacts = extract.contacts
messages = extract.messages
media = extract.media
vcard = extract.vcard
create_html = extract.create_html
contacts = android_handler.contacts
messages = android_handler.messages
media = android_handler.media
vcard = android_handler.vcard
create_html = android_handler.create_html
if args.db is None:
msg_db = "msgstore.db"
else:
msg_db = args.db
if args.wa is None:
contact_db = "wa.db"
else:
contact_db = args.wa
if args.key is not None:
if args.backup is None:
print("You must specify the backup file with -b")
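The date handling in this hunk turns the `--date` value into a SQL comparison fragment: `start - end` becomes `BETWEEN`, a `> ` prefix becomes `>=`, and a `< ` prefix becomes `<=`. Android timestamps are in milliseconds and iOS timestamps are shifted by `APPLE_TIME`. The sketch below restates that mapping as one function. Two things are assumptions: the value of `APPLE_TIME` (taken here as 978307200, the number of seconds between 1970-01-01 and 2001-01-01) and parsing dates as UTC, which is done only to make the output reproducible. The diff parses them in local time. `build_date_filter` is a hypothetical name.

```python
from datetime import datetime, timezone

APPLE_TIME = 978307200  # assumed value: seconds from 1970-01-01 to 2001-01-01 (UTC)


def _to_epoch(text: str, fmt: str) -> int:
    # Parsed as UTC for reproducibility; the diff uses the local time zone.
    parsed = datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


def _render(seconds: int, platform: str) -> str:
    if platform == "android":
        return f"{seconds}000"            # milliseconds since the Unix epoch
    if platform == "ios":
        return str(seconds - APPLE_TIME)  # seconds since 2001-01-01
    raise ValueError(f"unknown platform: {platform}")


def build_date_filter(value: str, platform: str, fmt: str = "%Y-%m-%d %H:%M") -> str:
    """Translate a --date value into a SQL fragment such as '>= 1704067200000'."""
    if " - " in value:
        start_text, end_text = value.split(" - ")
        start, end = _to_epoch(start_text, fmt), _to_epoch(end_text, fmt)
        if start > end:
            raise ValueError("The start date cannot be after the end date.")
        return f"BETWEEN {_render(start, platform)} AND {_render(end, platform)}"
    if value[:2] == "> ":
        return f">= {_render(_to_epoch(value[2:], fmt), platform)}"
    if value[:2] == "< ":
        return f"<= {_render(_to_epoch(value[2:], fmt), platform)}"
    raise ValueError("Unsupported date format.")


if __name__ == "__main__":
    print(build_date_filter("> 2024-01-01 00:00", "android"))
    print(build_date_filter("2024-01-01 00:00 - 2024-01-02 00:00", "ios"))
```

Unlike the diff, the sketch checks the prefix before parsing, so an unsupported value fails with the "unsupported format" error rather than a `strptime` error.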
@@ -245,7 +350,20 @@ def main():
elif all(char in string.hexdigits for char in args.key):
key = bytes.fromhex(args.key)
db = open(args.backup, "rb").read()
error = extract.decrypt_backup(db, key, msg_db, crypt, args.showkey)
if args.wab:
wab = open(args.wab, "rb").read()
error_wa = android_handler.decrypt_backup(wab, key, contact_db, crypt, args.showkey, DbType.CONTACT)
if isinstance(key, io.IOBase):
key.seek(0)
else:
error_wa = 0
error_message = android_handler.decrypt_backup(db, key, msg_db, crypt, args.showkey, DbType.MESSAGE)
if error_wa != 0:
error = error_wa
elif error_message != 0:
error = error_message
else:
error = 0
if error != 0:
if error == 1:
print("Dependencies of decrypt_backup and/or extract_encrypted_key"
@@ -258,10 +376,6 @@ def main():
else:
print("Unknown error occurred.", error)
exit(5)
if args.wa is None:
contact_db = "wa.db"
else:
contact_db = args.wa
if args.media is None:
args.media = "WhatsApp"
@@ -269,18 +383,18 @@ def main():
with sqlite3.connect(contact_db) as db:
db.row_factory = sqlite3.Row
contacts(db, data)
elif args.iphone:
elif args.ios:
import sys
if "--iphone" in sys.argv:
print(
"WARNING: The --iphone flag is deprecated and will "
"be removed in the future. Use --ios instead."
)
contacts = extract_iphone.contacts
messages = extract_iphone.messages
media = extract_iphone.media
vcard = extract_iphone.vcard
create_html = extract.create_html
contacts = ios_handler.contacts
messages = ios_handler.messages
media = ios_handler.media
vcard = ios_handler.vcard
create_html = android_handler.create_html
if args.business:
from Whatsapp_Chat_Exporter.utility import WhatsAppBusinessIdentifier as identifiers
else:
@@ -289,7 +403,7 @@ def main():
args.media = identifiers.DOMAIN
if args.backup is not None:
if not os.path.isdir(args.media):
extract_iphone_media.extract_media(args.backup, identifiers, args.preserve_timestamp)
ios_media_handler.extract_media(args.backup, identifiers, args.preserve_timestamp)
else:
print("WhatsApp directory already exists, skipping WhatsApp file extraction.")
if args.db is None:
@@ -309,11 +423,11 @@ def main():
if os.path.isfile(msg_db):
with sqlite3.connect(msg_db) as db:
db.row_factory = sqlite3.Row
messages(db, data, args.media)
media(db, data, args.media)
vcard(db, data, args.media)
messages(db, data, args.media, args.timezone_offset, args.filter_date, filter_chat)
media(db, data, args.media, args.filter_date, filter_chat, args.separate_media)
vcard(db, data, args.media, args.filter_date, filter_chat)
if args.android:
extract.calls(db, data)
android_handler.calls(db, data, args.timezone_offset, filter_chat)
if not args.no_html:
create_html(
data,
@@ -329,7 +443,7 @@ def main():
"The message database does not exist. You may specify the path "
"to database file with option -d or check your provided path."
)
exit(2)
exit(6)
if os.path.isdir(args.media):
media_path = os.path.join(args.output, args.media)
@@ -349,9 +463,9 @@ def main():
print("\nCannot remove original WhatsApp directory. "
"Perhaps the directory is opened?", end="\n")
elif args.exported:
extract_exported.messages(args.exported, data, args.assume_first_as_me)
exported_handler.messages(args.exported, data, args.assume_first_as_me)
if not args.no_html:
extract.create_html(
android_handler.create_html(
data,
args.output,
args.template,
@@ -363,7 +477,7 @@ def main():
shutil.copy(file, args.output)
elif args.import_json:
import_from_json(args.json, data)
extract.create_html(
android_handler.create_html(
data,
args.output,
args.template,
@@ -375,10 +489,26 @@ def main():
if args.json and not args.import_json:
if isinstance(data[next(iter(data))], ChatStore):
data = {jik: chat.to_json() for jik, chat in data.items()}
with open(args.json, "w") as f:
data = json.dumps(data)
print(f"\nWriting JSON file...({int(len(data)/1024/1024)}MB)")
f.write(data)
if not args.json_per_chat:
with open(args.json, "w") as f:
data = json.dumps(data)
print(f"\nWriting JSON file...({int(len(data)/1024/1024)}MB)")
f.write(data)
else:
if args.json[-5:] == ".json":
args.json = args.json[:-5]
total = len(data.keys())
if not os.path.isdir(args.json):
os.mkdir(args.json)
for index, jik in enumerate(data.keys()):
if data[jik]["name"] is not None:
contact = data[jik]["name"].replace('/', '')
else:
contact = jik.replace('+', '')
with open(f"{args.json}/{contact}.json", "w") as f:
f.write(json.dumps(data[jik]))
print(f"Writing JSON file...({index + 1}/{total})", end="\r")
print()
else:
print()
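The `--per-chat` branch above writes one JSON file per chat into a directory derived from the `--json` path. A compact sketch of the same flow follows, assuming each chat is already a plain dict with a `name` key, as it is after `to_json()` in the diff. `write_json_per_chat` is a hypothetical name, and `os.makedirs` is used in place of the `isdir`/`mkdir` pair.

```python
import json
import os


def write_json_per_chat(data: dict, destination: str) -> list:
    """Write each chat in `data` to its own JSON file and return the paths."""
    # A trailing ".json" is dropped so the remainder can serve as a directory.
    if destination.endswith(".json"):
        destination = destination[:-5]
    os.makedirs(destination, exist_ok=True)

    written = []
    for jid, chat in data.items():
        name = chat.get("name")
        # Same fallback as the diff: contact name without "/", else the JID without "+".
        stem = name.replace("/", "") if name is not None else jid.replace("+", "")
        path = os.path.join(destination, f"{stem}.json")
        with open(path, "w", encoding="utf-8") as f:
            json.dump(chat, f)
        written.append(path)
    return written
```

Two chats that share a display name would overwrite each other under this naming scheme, in the sketch and in the diff alike.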


@@ -1,21 +1,20 @@
#!/usr/bin/python3
import sqlite3
import json
import jinja2
import os
import shutil
import re
import io
import hmac
import shutil
from pathlib import Path
from mimetypes import MimeTypes
from markupsafe import escape as htmle
from hashlib import sha256
from base64 import b64decode, b64encode
from Whatsapp_Chat_Exporter.data_model import ChatStore, Message
from Whatsapp_Chat_Exporter.utility import MAX_SIZE, ROW_SIZE, determine_metadata, get_status_location
from Whatsapp_Chat_Exporter.utility import MAX_SIZE, ROW_SIZE, DbType, determine_metadata, JidType
from Whatsapp_Chat_Exporter.utility import rendering, Crypt, Device, get_file_name, setup_template
from Whatsapp_Chat_Exporter.utility import brute_force_offset, CRYPT14_OFFSETS, JidType
from Whatsapp_Chat_Exporter.utility import brute_force_offset, CRYPT14_OFFSETS, get_status_location
from Whatsapp_Chat_Exporter.utility import get_chat_condition, slugify
try:
import zlib
@@ -53,7 +52,7 @@ def _extract_encrypted_key(keyfile):
return _generate_hmac_of_hmac(key_stream)
def decrypt_backup(database, key, output, crypt=Crypt.CRYPT14, show_crypt15=False):
def decrypt_backup(database, key, output, crypt=Crypt.CRYPT14, show_crypt15=False, db_type=DbType.MESSAGE):
if not support_backup:
return 1
if isinstance(key, io.IOBase):
@@ -83,8 +82,12 @@ def decrypt_backup(database, key, output, crypt=Crypt.CRYPT14, show_crypt15=Fals
if len(database) < 131:
raise ValueError("The crypt15 file must be at least 131 bytes")
t1 = t2 = None
iv = database[8:24]
db_offset = database[0] + 2 # Skip protobuf + protobuf size and backup type
if db_type == DbType.MESSAGE:
iv = database[8:24]
db_offset = database[0] + 2 # Skip protobuf + protobuf size and backup type
elif db_type == DbType.CONTACT:
iv = database[7:23]
db_offset = database[0] + 1 # Skip protobuf + protobuf size
db_ciphertext = database[db_offset:]
if t1 != t2:
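The hunk above makes the crypt15 header offsets depend on the database type: message backups read the IV from bytes 8..24 and skip `database[0] + 2` bytes, while contact backups read it from bytes 7..23 and skip `database[0] + 1` bytes. The sketch below isolates just that slicing. The `DbType` enum values and the `split_crypt15` name are stand-ins, since the project's own `DbType` definition is not shown here.

```python
from enum import Enum


class DbType(Enum):
    # Stand-in for the project's DbType; the member values are assumptions.
    MESSAGE = "message"
    CONTACT = "contact"


def split_crypt15(database: bytes, db_type: DbType = DbType.MESSAGE):
    """Return (iv, ciphertext) using the offsets shown in the diff."""
    if len(database) < 131:
        raise ValueError("The crypt15 file must be at least 131 bytes")
    if db_type is DbType.MESSAGE:
        iv = database[8:24]
        db_offset = database[0] + 2  # protobuf + protobuf size and backup type
    elif db_type is DbType.CONTACT:
        iv = database[7:23]
        db_offset = database[0] + 1  # protobuf + protobuf size
    else:
        raise ValueError(f"unsupported database type: {db_type}")
    return iv, database[db_offset:]
```

The explicit `else` branch is an addition of the sketch: in the diff an unexpected `db_type` would leave `iv` and `db_offset` undefined.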
@@ -165,18 +168,33 @@ def contacts(db, data):
row = c.fetchone()
def messages(db, data, media_folder):
def messages(db, data, media_folder, timezone_offset, filter_date, filter_chat):
# Get message history
c = db.cursor()
try:
c.execute("""SELECT count() FROM messages""")
c.execute(f"""SELECT count()
FROM messages
WHERE 1=1
{f'AND timestamp {filter_date}' if filter_date is not None else ''}
{get_chat_condition(filter_chat[0], True, "messages.key_remote_jid")}
{get_chat_condition(filter_chat[1], False, "messages.key_remote_jid")}""")
except sqlite3.OperationalError:
c.execute("""SELECT count() FROM message""")
c.execute(f"""SELECT count()
FROM message
LEFT JOIN chat
ON chat._id = message.chat_row_id
INNER JOIN jid
ON jid._id = chat.jid_row_id
WHERE 1=1
{f'AND timestamp {filter_date}' if filter_date is not None else ''}
{get_chat_condition(filter_chat[0], True, "jid.raw_string")}
{get_chat_condition(filter_chat[1], False, "jid.raw_string")}""")
total_row_number = c.fetchone()[0]
print(f"Processing messages...(0/{total_row_number})", end="\r")
try:
c.execute("""SELECT messages.key_remote_jid,
c.execute(f"""SELECT messages.key_remote_jid,
messages._id,
messages.key_from_me,
messages.timestamp,
@@ -200,7 +218,7 @@ def messages(db, data, media_folder):
jid_new.raw_string as new_jid,
jid_global.type as jid_type,
group_concat(receipt_user.receipt_timestamp) as receipt_timestamp,
group_concat(message.received_timestamp) as received_timestamp,
group_concat(messages.received_timestamp) as received_timestamp,
group_concat(receipt_user.read_timestamp) as read_timestamp,
group_concat(receipt_user.played_timestamp) as played_timestamp,
group_concat(messages.read_device_timestamp) as read_device_timestamp
@@ -226,11 +244,15 @@ def messages(db, data, media_folder):
LEFT JOIN receipt_user
ON receipt_user.message_row_id = messages._id
WHERE messages.key_remote_jid <> '-1'
GROUP BY message._id;"""
{f'AND messages.timestamp {filter_date}' if filter_date is not None else ''}
{get_chat_condition(filter_chat[0], True, "messages.key_remote_jid")}
{get_chat_condition(filter_chat[1], False, "messages.key_remote_jid")}
GROUP BY messages._id
ORDER BY messages.timestamp ASC;"""
)
except sqlite3.OperationalError:
try:
c.execute("""SELECT jid_global.raw_string as key_remote_jid,
c.execute(f"""SELECT jid_global.raw_string as key_remote_jid,
message._id,
message.from_me as key_from_me,
message.timestamp,
@@ -290,6 +312,9 @@ def messages(db, data, media_folder):
LEFT JOIN receipt_user
ON receipt_user.message_row_id = message._id
WHERE key_remote_jid <> '-1'
{f'AND message.timestamp {filter_date}' if filter_date is not None else ''}
{get_chat_condition(filter_chat[0], True, "key_remote_jid")}
{get_chat_condition(filter_chat[1], False, "key_remote_jid")}
GROUP BY message._id;"""
)
except Exception as e:
@@ -320,6 +345,7 @@ def messages(db, data, media_folder):
timestamp=content["timestamp"],
time=content["timestamp"],
key_id=content["key_id"],
timezone_offset=timezone_offset
)
if isinstance(content["data"], bytes):
message.data = ("The message is binary data and its base64 is "
@@ -453,15 +479,36 @@ def messages(db, data, media_folder):
print(f"Processing messages...({total_row_number}/{total_row_number})", end="\r")
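The queries in `messages()` append the result of `get_chat_condition(filter, include, column)` to the `WHERE` clause, but the helper's body is not part of this diff. The following is only a guess at what such a helper could look like, under the assumption that it matches phone numbers with `LIKE` and returns an empty string when no filter is set. `chat_condition_sketch` is a hypothetical name.

```python
import sqlite3


def chat_condition_sketch(numbers, include: bool, column: str) -> str:
    """Build an 'AND (...)' fragment for an include or exclude chat filter.

    Hypothetical reconstruction: returns "" for None or an empty list.
    """
    if not numbers:
        return ""
    for number in numbers:
        # Values are interpolated into SQL, so only digits are accepted.
        if not number.isnumeric():
            raise ValueError("chat filters must be phone numbers")
    if include:
        operator, joiner = "LIKE", " OR "
    else:
        operator, joiner = "NOT LIKE", " AND "
    clauses = joiner.join(f"{column} {operator} '%{n}%'" for n in numbers)
    return f"AND ({clauses})"


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE jid (raw_string TEXT)")
    db.executemany("INSERT INTO jid VALUES (?)",
                   [("85212345678@s.whatsapp.net",), ("4471234@s.whatsapp.net",)])
    condition = chat_condition_sketch(["852"], True, "jid.raw_string")
    print(db.execute(f"SELECT raw_string FROM jid WHERE 1=1 {condition}").fetchall())
```

The `WHERE 1=1` prefix used throughout the diff is what lets each optional fragment start with `AND` whether or not the others are present.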
def media(db, data, media_folder):
def media(db, data, media_folder, filter_date, filter_chat, separate_media=True):
# Get media
c = db.cursor()
c.execute("""SELECT count() FROM message_media""")
try:
c.execute(f"""SELECT count()
FROM message_media
INNER JOIN messages
ON message_media.message_row_id = messages._id
WHERE 1=1
{f'AND messages.timestamp {filter_date}' if filter_date is not None else ''}
{get_chat_condition(filter_chat[0], True, "messages.key_remote_jid")}
{get_chat_condition(filter_chat[1], False, "messages.key_remote_jid")}""")
except sqlite3.OperationalError:
c.execute(f"""SELECT count()
FROM message_media
INNER JOIN message
ON message_media.message_row_id = message._id
LEFT JOIN chat
ON chat._id = message.chat_row_id
INNER JOIN jid
ON jid._id = chat.jid_row_id
WHERE 1=1
{f'AND message.timestamp {filter_date}' if filter_date is not None else ''}
{get_chat_condition(filter_chat[0], True, "jid.raw_string")}
{get_chat_condition(filter_chat[1], False, "jid.raw_string")}""")
total_row_number = c.fetchone()[0]
print(f"\nProcessing media...(0/{total_row_number})", end="\r")
i = 0
try:
c.execute("""SELECT messages.key_remote_jid,
c.execute(f"""SELECT messages.key_remote_jid,
message_row_id,
file_path,
message_url,
@@ -474,11 +521,16 @@ def media(db, data, media_folder):
ON message_media.message_row_id = messages._id
LEFT JOIN media_hash_thumbnail
ON message_media.file_hash = media_hash_thumbnail.media_hash
INNER JOIN jid
ON messages.key_remote_jid = jid.raw_string
WHERE jid.type <> 7
{f'AND messages.timestamp {filter_date}' if filter_date is not None else ''}
{get_chat_condition(filter_chat[0], True, "messages.key_remote_jid")}
{get_chat_condition(filter_chat[1], False, "messages.key_remote_jid")}
ORDER BY messages.key_remote_jid ASC"""
)
except sqlite3.OperationalError:
c.execute("""SELECT jid.raw_string as key_remote_jid,
c.execute(f"""SELECT jid.raw_string as key_remote_jid,
message_row_id,
file_path,
message_url,
@@ -496,6 +548,9 @@ def media(db, data, media_folder):
LEFT JOIN media_hash_thumbnail
ON message_media.file_hash = media_hash_thumbnail.media_hash
WHERE jid.type <> 7
{f'AND message.timestamp {filter_date}' if filter_date is not None else ''}
{get_chat_condition(filter_chat[0], True, "key_remote_jid")}
{get_chat_condition(filter_chat[1], False, "key_remote_jid")}
ORDER BY jid.raw_string ASC"""
)
content = c.fetchone()
@@ -516,6 +571,15 @@ def media(db, data, media_folder):
message.mime = "application/octet-stream"
else:
message.mime = content["mime_type"]
if separate_media:
chat_display_name = slugify(data[content["key_remote_jid"]].name or message.sender \
or content["key_remote_jid"].split('@')[0], True)
current_filename = file_path.split("/")[-1]
new_folder = os.path.join(media_folder, "separated", chat_display_name)
Path(new_folder).mkdir(parents=True, exist_ok=True)
new_path = os.path.join(new_folder, current_filename)
shutil.copy2(file_path, new_path)
message.data = new_path
else:
if False: # Block execution
try:
@@ -545,20 +609,24 @@ def media(db, data, media_folder):
f"Processing media...({total_row_number}/{total_row_number})", end="\r")
def vcard(db, data, media_folder):
def vcard(db, data, media_folder, filter_date, filter_chat):
c = db.cursor()
try:
c.execute("""SELECT message_row_id,
c.execute(f"""SELECT message_row_id,
messages.key_remote_jid,
vcard,
messages.media_name
FROM messages_vcards
INNER JOIN messages
ON messages_vcards.message_row_id = messages._id
WHERE 1=1
{f'AND messages.timestamp {filter_date}' if filter_date is not None else ''}
{get_chat_condition(filter_chat[0], True, "messages.key_remote_jid")}
{get_chat_condition(filter_chat[1], False, "messages.key_remote_jid")}
ORDER BY messages.key_remote_jid ASC;"""
)
except sqlite3.OperationalError:
c.execute("""SELECT message_row_id,
c.execute(f"""SELECT message_row_id,
jid.raw_string as key_remote_jid,
vcard,
message.text_data as media_name
@@ -569,6 +637,10 @@ def vcard(db, data, media_folder):
ON chat._id = message.chat_row_id
INNER JOIN jid
ON jid._id = chat.jid_row_id
WHERE 1=1
{f'AND message.timestamp {filter_date}' if filter_date is not None else ''}
{get_chat_condition(filter_chat[0], True, "key_remote_jid")}
{get_chat_condition(filter_chat[1], False, "key_remote_jid")}
ORDER BY message.chat_row_id ASC;"""
)
@@ -579,7 +651,7 @@ def vcard(db, data, media_folder):
if not os.path.isdir(path):
Path(path).mkdir(parents=True, exist_ok=True)
for index, row in enumerate(rows):
media_name = row["media_name"] if row["media_name"] is not None else ""
media_name = row["media_name"] if row["media_name"] is not None else "Undefined vCard File"
file_name = "".join(x for x in media_name if x.isalnum())
file_name = file_name.encode('utf-8')[:230].decode('utf-8', 'ignore')
file_path = os.path.join(path, f"{file_name}.vcf")
@@ -587,22 +659,30 @@ def vcard(db, data, media_folder):
with open(file_path, "w", encoding="utf-8") as f:
f.write(row["vcard"])
message = data[row["key_remote_jid"]].messages[row["message_row_id"]]
message.data = media_name + \
"The vCard file cannot be displayed here, " \
f"however it should be located at {file_path}"
message.data = "This media includes the following vCard file(s):<br>" \
f'<a href="{htmle(file_path)}">{htmle(media_name)}</a>'
message.mime = "text/x-vcard"
message.meta = True
message.safe = True
print(f"Processing vCards...({index + 1}/{total_row_number})", end="\r")
def calls(db, data):
def calls(db, data, timezone_offset, filter_chat):
c = db.cursor()
c.execute("""SELECT count() FROM call_log""")
c.execute(f"""SELECT count()
FROM call_log
INNER JOIN jid
ON call_log.jid_row_id = jid._id
LEFT JOIN chat
ON call_log.jid_row_id = chat.jid_row_id
WHERE 1=1
{get_chat_condition(filter_chat[0], True, "jid.raw_string")}
{get_chat_condition(filter_chat[1], False, "jid.raw_string")}""")
total_row_number = c.fetchone()[0]
if total_row_number == 0:
return
print(f"\nProcessing calls...({total_row_number})", end="\r")
c.execute("""SELECT call_log._id,
c.execute(f"""SELECT call_log._id,
jid.raw_string,
from_me,
call_id,
@@ -616,7 +696,10 @@ def calls(db, data):
INNER JOIN jid
ON call_log.jid_row_id = jid._id
LEFT JOIN chat
ON call_log.jid_row_id = chat.jid_row_id"""
ON call_log.jid_row_id = chat.jid_row_id
WHERE 1=1
{get_chat_condition(filter_chat[0], True, "jid.raw_string")}
{get_chat_condition(filter_chat[1], False, "jid.raw_string")}"""
)
chat = ChatStore(Device.ANDROID, "WhatsApp Calls")
content = c.fetchone()
@@ -626,6 +709,7 @@ def calls(db, data):
timestamp=content["timestamp"],
time=content["timestamp"],
key_id=content["call_id"],
timezone_offset=timezone_offset
)
_jid = content["raw_string"]
name = data[_jid].name if _jid in data else content["chat_subject"] or None


@@ -1,10 +1,19 @@
#!/usr/bin/python3
import os
from datetime import datetime
from datetime import datetime, tzinfo, timedelta
from typing import Union
class TimeZone(tzinfo):
def __init__(self, offset):
self.offset = offset
def utcoffset(self, dt):
return timedelta(hours=self.offset)
def dst(self, dt):
return timedelta(0)
class ChatStore():
def __init__(self, type, name=None, media=None):
if name is not None and not isinstance(name, str):
@@ -55,15 +64,15 @@ class ChatStore():
class Message():
def __init__(self, from_me: Union[bool,int], timestamp: int, time: Union[int,float,str], key_id: int):
def __init__(self, from_me: Union[bool,int], timestamp: int, time: Union[int,float,str], key_id: int, timezone_offset: int = 0):
self.from_me = bool(from_me)
self.timestamp = timestamp / 1000 if timestamp > 9999999999 else timestamp
if isinstance(time, int) or isinstance(time, float):
self.time = datetime.fromtimestamp(time/1000).strftime("%H:%M")
self.time = datetime.fromtimestamp(self.timestamp, TimeZone(timezone_offset)).strftime("%H:%M")
elif isinstance(time, str):
self.time = time
else:
raise TypeError("Time must be a string or integer")
raise TypeError("Time must be a string or number")
self.media = False
self.key_id = key_id
self.meta = False
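
The TimeZone class in this hunk is a fixed-offset tzinfo: it shifts a UTC timestamp by a whole number of hours and reports no DST. The sketch below reproduces that class and compares it with the standard library's `datetime.timezone`, which gives the same result for whole-hour offsets. The helper names `format_time` and `format_time_stdlib` are assumptions made for illustration and are not part of the project.

```python
from datetime import datetime, timedelta, timezone, tzinfo


class TimeZone(tzinfo):
    """Fixed-offset zone, mirroring the class added in this diff."""

    def __init__(self, offset):
        self.offset = offset

    def utcoffset(self, dt):
        return timedelta(hours=self.offset)

    def dst(self, dt):
        return timedelta(0)


def format_time(timestamp, offset_hours):
    """Render a Unix timestamp (seconds) as HH:MM using the tzinfo subclass."""
    return datetime.fromtimestamp(timestamp, TimeZone(offset_hours)).strftime("%H:%M")


def format_time_stdlib(timestamp, offset_hours):
    """Same rendering using the standard library's fixed-offset timezone."""
    tz = timezone(timedelta(hours=offset_hours))
    return datetime.fromtimestamp(timestamp, tz).strftime("%H:%M")


if __name__ == "__main__":
    # Midnight UTC at the epoch, viewed from UTC+8.
    print(format_time(0, 8), format_time_stdlib(0, 8))
```

Passing an explicit tzinfo makes the rendered time independent of the machine's local timezone, which appears to be the point of the `timezone_offset` parameter added to `Message`.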


@@ -1,11 +1,13 @@
#!/usr/bin/python3
import os
import shutil
from glob import glob
from pathlib import Path
from mimetypes import MimeTypes
from markupsafe import escape as htmle
from Whatsapp_Chat_Exporter.data_model import ChatStore, Message
from Whatsapp_Chat_Exporter.utility import APPLE_TIME, Device
from Whatsapp_Chat_Exporter.utility import APPLE_TIME, Device, get_chat_condition, slugify
def contacts(db, data):
@@ -18,26 +20,33 @@ def contacts(db, data):
content = c.fetchone()
while content is not None:
if not content["ZWHATSAPPID"].endswith("@s.whatsapp.net"):
_id = content["ZWHATSAPPID"] + "@s.whatsapp.net"
data[_id] = ChatStore(Device.IOS)
data[_id].status = content["ZABOUTTEXT"]
ZWHATSAPPID = content["ZWHATSAPPID"] + "@s.whatsapp.net"
data[ZWHATSAPPID] = ChatStore(Device.IOS)
data[ZWHATSAPPID].status = content["ZABOUTTEXT"]
content = c.fetchone()
def messages(db, data, media_folder):
def messages(db, data, media_folder, timezone_offset, filter_date, filter_chat):
c = db.cursor()
# Get contacts
c.execute("""SELECT count() FROM ZWACHATSESSION""")
c.execute(f"""SELECT count()
FROM ZWACHATSESSION
WHERE 1=1
{get_chat_condition(filter_chat[0], True, "ZWACHATSESSION.ZCONTACTJID")}
{get_chat_condition(filter_chat[1], False, "ZWACHATSESSION.ZCONTACTJID")}""")
total_row_number = c.fetchone()[0]
print(f"Processing contacts...({total_row_number})")
c.execute(
"""SELECT ZCONTACTJID,
f"""SELECT ZCONTACTJID,
ZPARTNERNAME,
ZPUSHNAME
FROM ZWACHATSESSION
LEFT JOIN ZWAPROFILEPUSHNAME
ON ZWACHATSESSION.ZCONTACTJID = ZWAPROFILEPUSHNAME.ZJID;"""
ON ZWACHATSESSION.ZCONTACTJID = ZWAPROFILEPUSHNAME.ZJID
WHERE 1=1
{get_chat_condition(filter_chat[0], True, "ZWACHATSESSION.ZCONTACTJID")}
{get_chat_condition(filter_chat[1], False, "ZWACHATSESSION.ZCONTACTJID")};"""
)
content = c.fetchone()
while content is not None:
@@ -65,11 +74,17 @@ def messages(db, data, media_folder):
content = c.fetchone()
# Get message history
c.execute("""SELECT count() FROM ZWAMESSAGE""")
c.execute(f"""SELECT count()
FROM ZWAMESSAGE
INNER JOIN ZWACHATSESSION
ON ZWAMESSAGE.ZCHATSESSION = ZWACHATSESSION.Z_PK
WHERE 1=1
{f'AND ZMESSAGEDATE {filter_date}' if filter_date is not None else ''}
{get_chat_condition(filter_chat[0], True, "ZWACHATSESSION.ZCONTACTJID")}
{get_chat_condition(filter_chat[1], False, "ZWACHATSESSION.ZCONTACTJID")}""")
total_row_number = c.fetchone()[0]
print(f"Processing messages...(0/{total_row_number})", end="\r")
c.execute("""SELECT COALESCE(ZFROMJID, ZTOJID) as _id,
c.execute(f"""SELECT ZCONTACTJID,
ZWAMESSAGE.Z_PK,
ZISFROMME,
ZMESSAGEDATE,
@@ -77,38 +92,48 @@ def messages(db, data, media_folder):
ZMESSAGETYPE,
ZWAGROUPMEMBER.ZMEMBERJID,
ZMETADATA,
ZSTANZAID
ZSTANZAID,
ZGROUPINFO
FROM ZWAMESSAGE
LEFT JOIN ZWAGROUPMEMBER
ON ZWAMESSAGE.ZGROUPMEMBER = ZWAGROUPMEMBER.Z_PK
LEFT JOIN ZWAMEDIAITEM
ON ZWAMESSAGE.Z_PK = ZWAMEDIAITEM.ZMESSAGE;""")
ON ZWAMESSAGE.Z_PK = ZWAMEDIAITEM.ZMESSAGE
INNER JOIN ZWACHATSESSION
ON ZWAMESSAGE.ZCHATSESSION = ZWACHATSESSION.Z_PK
WHERE 1=1
{f'AND ZMESSAGEDATE {filter_date}' if filter_date is not None else ''}
{get_chat_condition(filter_chat[0], True, "ZCONTACTJID")}
{get_chat_condition(filter_chat[1], False, "ZCONTACTJID")}
ORDER BY ZMESSAGEDATE ASC;""")
i = 0
content = c.fetchone()
while content is not None:
_id = content["_id"]
ZCONTACTJID = content["ZCONTACTJID"]
Z_PK = content["Z_PK"]
if _id not in data:
data[_id] = ChatStore(Device.IOS)
path = f'{media_folder}/Media/Profile/{_id.split("@")[0]}'
is_group_message = content["ZGROUPINFO"] is not None
if ZCONTACTJID not in data:
data[ZCONTACTJID] = ChatStore(Device.IOS)
path = f'{media_folder}/Media/Profile/{ZCONTACTJID.split("@")[0]}'
avatars = glob(f"{path}*")
if 0 < len(avatars) <= 1:
data[_id].their_avatar = avatars[0]
data[ZCONTACTJID].their_avatar = avatars[0]
else:
for avatar in avatars:
if avatar.endswith(".thumb"):
data[_id].their_avatar_thumb = avatar
data[ZCONTACTJID].their_avatar_thumb = avatar
elif avatar.endswith(".jpg"):
data[_id].their_avatar = avatar
data[ZCONTACTJID].their_avatar = avatar
ts = APPLE_TIME + content["ZMESSAGEDATE"]
message = Message(
from_me=content["ZISFROMME"],
timestamp=ts,
time=ts, # TODO: Could be bug
key_id=content["ZSTANZAID"][:17],
timezone_offset=timezone_offset
)
invalid = False
if "-" in _id and content["ZISFROMME"] == 0:
if is_group_message and content["ZISFROMME"] == 0:
name = None
if content["ZMEMBERJID"] is not None:
if content["ZMEMBERJID"] in data:
@@ -124,7 +149,7 @@ def messages(db, data, media_folder):
message.sender = None
if content["ZMESSAGETYPE"] == 6:
# Metadata
if "-" in _id:
if is_group_message:
# Group
if content["ZTEXT"] is not None:
# Changed name
@@ -173,7 +198,7 @@ def messages(db, data, media_folder):
msg = msg.replace("\n", "<br>")
message.data = msg
if not invalid:
data[_id].add_message(Z_PK, message)
data[ZCONTACTJID].add_message(Z_PK, message)
i += 1
if i % 1000 == 0:
print(f"Processing messages...({i}/{total_row_number})", end="\r")
@@ -182,14 +207,24 @@ def messages(db, data, media_folder):
f"Processing messages...({total_row_number}/{total_row_number})", end="\r")
def media(db, data, media_folder):
def media(db, data, media_folder, filter_date, filter_chat, separate_media=False):
c = db.cursor()
# Get media
c.execute("""SELECT count() FROM ZWAMEDIAITEM""")
c.execute(f"""SELECT count()
FROM ZWAMEDIAITEM
INNER JOIN ZWAMESSAGE
ON ZWAMEDIAITEM.ZMESSAGE = ZWAMESSAGE.Z_PK
INNER JOIN ZWACHATSESSION
ON ZWAMESSAGE.ZCHATSESSION = ZWACHATSESSION.Z_PK
WHERE 1=1
{f'AND ZMESSAGEDATE {filter_date}' if filter_date is not None else ''}
{get_chat_condition(filter_chat[0], True, "ZWACHATSESSION.ZCONTACTJID")}
{get_chat_condition(filter_chat[1], False, "ZWACHATSESSION.ZCONTACTJID")}
""")
total_row_number = c.fetchone()[0]
print(f"\nProcessing media...(0/{total_row_number})", end="\r")
i = 0
c.execute("""SELECT COALESCE(ZWAMESSAGE.ZFROMJID, ZWAMESSAGE.ZTOJID) as _id,
c.execute(f"""SELECT ZCONTACTJID,
ZMESSAGE,
ZMEDIALOCALPATH,
ZMEDIAURL,
@@ -199,15 +234,19 @@ def media(db, data, media_folder):
FROM ZWAMEDIAITEM
INNER JOIN ZWAMESSAGE
ON ZWAMEDIAITEM.ZMESSAGE = ZWAMESSAGE.Z_PK
INNER JOIN ZWACHATSESSION
ON ZWAMESSAGE.ZCHATSESSION = ZWACHATSESSION.Z_PK
WHERE ZMEDIALOCALPATH IS NOT NULL
ORDER BY _id ASC""")
{f'AND ZWAMESSAGE.ZMESSAGEDATE {filter_date}' if filter_date is not None else ''}
{get_chat_condition(filter_chat[0], True, "ZCONTACTJID")}
{get_chat_condition(filter_chat[1], False, "ZCONTACTJID")}
ORDER BY ZCONTACTJID ASC""")
content = c.fetchone()
mime = MimeTypes()
while content is not None:
file_path = f"{media_folder}/Message/{content['ZMEDIALOCALPATH']}"
_id = content["_id"]
ZMESSAGE = content["ZMESSAGE"]
message = data[_id].messages[ZMESSAGE]
message = data[content["ZCONTACTJID"]].messages[ZMESSAGE]
message.media = True
if os.path.isfile(file_path):
message.data = file_path
@@ -219,6 +258,15 @@ def media(db, data, media_folder):
message.mime = "application/octet-stream"
else:
message.mime = content["ZVCARDSTRING"]
if separate_media:
chat_display_name = slugify(data[content["ZCONTACTJID"]].name or message.sender \
or content["ZCONTACTJID"].split('@')[0], True)
current_filename = file_path.split("/")[-1]
new_folder = os.path.join(media_folder, "separated", chat_display_name)
Path(new_folder).mkdir(parents=True, exist_ok=True)
new_path = os.path.join(new_folder, current_filename)
shutil.copy2(file_path, new_path)
message.data = new_path
else:
if False: # Block execution
try:
@@ -244,37 +292,55 @@ def media(db, data, media_folder):
f"Processing media...({total_row_number}/{total_row_number})", end="\r")
def vcard(db, data, media_folder):
def vcard(db, data, media_folder, filter_date, filter_chat):
c = db.cursor()
c.execute("""SELECT DISTINCT ZWAVCARDMENTION.ZMEDIAITEM,
c.execute(f"""SELECT DISTINCT ZWAVCARDMENTION.ZMEDIAITEM,
ZWAMEDIAITEM.ZMESSAGE,
COALESCE(ZWAMESSAGE.ZFROMJID,
ZWAMESSAGE.ZTOJID) as _id,
ZCONTACTJID,
ZVCARDNAME,
ZVCARDSTRING
FROM ZWAVCARDMENTION
INNER JOIN ZWAMEDIAITEM
ON ZWAVCARDMENTION.ZMEDIAITEM = ZWAMEDIAITEM.Z_PK
INNER JOIN ZWAMESSAGE
ON ZWAMEDIAITEM.ZMESSAGE = ZWAMESSAGE.Z_PK""")
ON ZWAMEDIAITEM.ZMESSAGE = ZWAMESSAGE.Z_PK
INNER JOIN ZWACHATSESSION
ON ZWAMESSAGE.ZCHATSESSION = ZWACHATSESSION.Z_PK
WHERE 1=1
{f'AND ZWAMESSAGE.ZMESSAGEDATE {filter_date}' if filter_date is not None else ''}
{get_chat_condition(filter_chat[0], True, "ZCONTACTJID")}
{get_chat_condition(filter_chat[1], False, "ZCONTACTJID")};""")
contents = c.fetchall()
total_row_number = len(contents)
print(f"\nProcessing vCards...(0/{total_row_number})", end="\r")
path = f'{media_folder}/Message/vCards'
if not os.path.isdir(path):
Path(path).mkdir(parents=True, exist_ok=True)
Path(path).mkdir(parents=True, exist_ok=True)
for index, content in enumerate(contents):
file_name = "".join(x for x in content["ZVCARDNAME"] if x.isalnum())
file_name = file_name.encode('utf-8')[:230].decode('utf-8', 'ignore')
file_path = os.path.join(path, f"{file_name}.vcf")
if not os.path.isfile(file_path):
with open(file_path, "w", encoding="utf-8") as f:
f.write(content["ZVCARDSTRING"])
message = data[content["_id"]].messages[content["ZMESSAGE"]]
message.data = content["ZVCARDNAME"] + \
"The vCard file cannot be displayed here, " \
f"however it should be located at {file_path}"
file_paths = []
vcard_names = content["ZVCARDNAME"].split("_$!<Name-Separator>!$_")
vcard_strings = content["ZVCARDSTRING"].split("_$!<VCard-Separator>!$_")
# If this is a list of contacts
if len(vcard_names) > len(vcard_strings):
vcard_names.pop(0) # Dismiss the first element, which is the group name
for name, vcard_string in zip(vcard_names, vcard_strings):
file_name = "".join(x for x in name if x.isalnum())
file_name = file_name.encode('utf-8')[:230].decode('utf-8', 'ignore')
file_path = os.path.join(path, f"{file_name}.vcf")
file_paths.append(file_path)
if not os.path.isfile(file_path):
with open(file_path, "w", encoding="utf-8") as f:
f.write(vcard_string)
vcard_summary = "This media includes the following vCard file(s):<br>"
vcard_summary += " | ".join([f'<a href="{htmle(fp)}">{htmle(name)}</a>' for name, fp in zip(vcard_names, file_paths)])
message = data[content["ZCONTACTJID"]].messages[content["ZMESSAGE"]]
message.data = vcard_summary
message.mime = "text/x-vcard"
message.media = True
message.meta = True
message.safe = True
print(f"Processing vCards...({index + 1}/{total_row_number})", end="\r")
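
The iOS vcard() change above splits ZVCARDNAME and ZVCARDSTRING on their separators and, when there is one more name than there are vCards, drops the first name as the group label. A minimal sketch of just that pairing step follows; the function name `split_vcards` and the sample values are hypothetical, and only the two separator strings come from the diff.

```python
NAME_SEPARATOR = "_$!<Name-Separator>!$_"
VCARD_SEPARATOR = "_$!<VCard-Separator>!$_"


def split_vcards(vcard_name, vcard_string):
    """Pair each contact name with its vCard body.

    When a message carries a list of contacts, the name field has one
    extra leading element (the group name), which is discarded here.
    """
    names = vcard_name.split(NAME_SEPARATOR)
    cards = vcard_string.split(VCARD_SEPARATOR)
    if len(names) > len(cards):
        names = names[1:]  # drop the group name
    return list(zip(names, cards))


if __name__ == "__main__":
    # Hypothetical data: a group label followed by two contacts.
    name_field = NAME_SEPARATOR.join(["2 contacts", "Alice", "Bob"])
    card_field = VCARD_SEPARATOR.join(["BEGIN:VCARD A", "BEGIN:VCARD B"])
    for name, card in split_vcards(name_field, card_field):
        print(name, "->", card)
```

Note that `zip` stops at the shorter list, so any unmatched trailing names or vCards are silently ignored, which matches the loop in the diff.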


@@ -3,6 +3,8 @@ import json
import os
from bleach import clean as sanitize
from markupsafe import Markup
import unicodedata
import re
from datetime import datetime
from enum import IntEnum
from Whatsapp_Chat_Exporter.data_model import ChatStore
@@ -156,6 +158,16 @@ def get_file_name(contact: str, chat: ChatStore):
return "".join(x for x in file_name if x.isalnum() or x in "- "), name
def get_chat_condition(filter, include, column):
if filter is not None:
if include:
return f'''AND ({' OR '.join(f"{column} LIKE '%{chat}%'" for chat in filter)})'''
else:
return f'''AND ({' AND '.join(f"{column} NOT LIKE '%{chat}%'" for chat in filter)})'''
else:
return ""
# Android Specific
CRYPT14_OFFSETS = (
{"iv": 67, "db": 191},
@@ -172,6 +184,11 @@ class Crypt(IntEnum):
CRYPT12 = 12
class DbType(StrEnum):
MESSAGE = "message"
CONTACT = "contact"
def brute_force_offset(max_iv=200, max_db=200):
for iv in range(0, max_iv):
for db in range(0, max_db):
@@ -294,6 +311,23 @@ def setup_template(template, no_avatar):
APPLE_TIME = datetime.timestamp(datetime(2001, 1, 1))
def slugify(value, allow_unicode=False):
"""
Taken from https://github.com/django/django/blob/master/django/utils/text.py
Convert to ASCII if 'allow_unicode' is False. Convert spaces or repeated
dashes to single dashes. Remove characters that aren't alphanumerics,
underscores, or hyphens. Convert to lowercase. Also strip leading and
trailing whitespace, dashes, and underscores.
"""
value = str(value)
if allow_unicode:
value = unicodedata.normalize('NFKC', value)
else:
value = unicodedata.normalize('NFKD', value).encode('ascii', 'ignore').decode('ascii')
value = re.sub(r'[^\w\s-]', '', value.lower())
return re.sub(r'[-\s]+', '-', value).strip('-_')
class WhatsAppIdentifier(StrEnum):
MESSAGE = "7c7fba66680ef796b916b067077cc246adacf01d"
CONTACT = "b8548dc30aa1030df0ce18ef08b882cf7ab5212f"
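
get_chat_condition in this file interpolates each filter value directly into the SQL text, so a value containing a quote can change the query. One possible alternative, sketched below, returns a fragment with `?` placeholders plus a parameter list for sqlite3 to bind. The name `chat_condition_params`, its return shape, and the sample table are assumptions for illustration and not the project's API; the column name is still interpolated and must therefore be a trusted, hard-coded string.

```python
import sqlite3


def chat_condition_params(chats, include, column):
    """Return (sql_fragment, params) for an include or exclude chat filter.

    `column` is interpolated as-is and must come from trusted code;
    the filter values themselves are bound as parameters.
    """
    if not chats:
        return "", []
    operator, joiner = ("LIKE", " OR ") if include else ("NOT LIKE", " AND ")
    clause = joiner.join(f"{column} {operator} ?" for _ in chats)
    return f"AND ({clause})", [f"%{chat}%" for chat in chats]


def filter_jids(conn, include_chats, exclude_chats):
    """Run a filtered query against a hypothetical `jid` table."""
    inc_sql, inc_params = chat_condition_params(include_chats, True, "jid.raw_string")
    exc_sql, exc_params = chat_condition_params(exclude_chats, False, "jid.raw_string")
    query = f"SELECT raw_string FROM jid WHERE 1=1 {inc_sql} {exc_sql} ORDER BY raw_string"
    return [row[0] for row in conn.execute(query, inc_params + exc_params)]


def make_demo_db():
    """Build an in-memory database with made-up JIDs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE jid (raw_string TEXT)")
    conn.executemany(
        "INSERT INTO jid VALUES (?)",
        [("85212345678@s.whatsapp.net",),
         ("85298765432@s.whatsapp.net",),
         ("4915112345@s.whatsapp.net",)],
    )
    return conn


if __name__ == "__main__":
    print(filter_jids(make_demo_db(), ["852"], ["9876"]))
```

Because the parameters are concatenated in the same order as the fragments appear in the query, the include list must be bound before the exclude list.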

docs.html Normal file

@@ -0,0 +1,20 @@
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="refresh" content="0; url='https://github.com/KnugiHK/WhatsApp-Chat-Exporter/wiki'" />
<script type="text/javascript">
destination = {
"filter": "Filter",
"date": "Filters#date-filters",
"chat": "Filters#chat-filter",
"osl": "Open-Source-Licenses",
null: ""
};
const dest = new URLSearchParams(window.location.search).get('dest');
window.location.href = `https://github.com/KnugiHK/WhatsApp-Chat-Exporter/wiki/${destination[dest]}`;
</script>
</head>
<body>
<p>If the redirection doesn't work, you can find the documentation at <a href="https://github.com/KnugiHK/WhatsApp-Chat-Exporter/wiki">https://github.com/KnugiHK/WhatsApp-Chat-Exporter/wiki</a>.</p>
</body>
</html>


@@ -0,0 +1,48 @@
import hmac
import javaobj
import zlib
from Crypto.Cipher import AES
from hashlib import sha256
def _generate_hmac_of_hmac(key_stream):
key = hmac.new(
hmac.new(
b'\x00' * 32,
key_stream,
sha256
).digest(),
b"backup encryption\x01",
sha256
)
return key.digest(), key_stream
def _extract_encrypted_key(keyfile):
key_stream = b""
for byte in javaobj.loads(keyfile):
key_stream += byte.to_bytes(1, "big", signed=True)
return _generate_hmac_of_hmac(key_stream)
key = open("encrypted_backup.key", "rb").read()
database = open("wa.db.crypt15", "rb").read()
main_key, hex_key = _extract_encrypted_key(key)
for i in range(100):
iv = database[i:i+16]
for j in range(100):
cipher = AES.new(main_key, AES.MODE_GCM, iv)
db_ciphertext = database[j:]
db_compressed = cipher.decrypt(db_ciphertext)
try:
db = zlib.decompress(db_compressed)
except zlib.error:
...
else:
if db[0:6] == b"SQLite":
print(f"Found!\nIV: {i}\nOffset: {j}")
print(db_compressed[:10])
exit()
print("Not found! Try to increase maximum search.")