A Forensic Gold Mine III: Forensic Analysis of the Microsoft Teams Desktop Client
As part of my master’s thesis at Abertay University, I spent most of the past three months digging through the artefacts that the Microsoft Teams Desktop Client generates during use and analysing how these could be used in a forensic investigation. My research showed that Microsoft Teams stores an abundance of information, both metadata and user-generated artefacts, that can prove extremely valuable. As my thesis turned out quite technical and is still in the publication process, this post provides a first overview of my findings. I will also introduce you to my brand-new Autopsy parser for Microsoft Teams, which allows communication artefacts such as messages, contacts and call logs to be extracted programmatically.
Microsoft Teams’ Directory Structure
AppData Local Installation Directory
The most common installation method [1] for Microsoft Teams is probably the one-click, Squirrel-based installer retrievable from https://www.microsoft.com/en-ww/microsoft-teams/download-app, which installs all the application files to the user’s C:\Users\%USERNAME%\AppData\Local\Microsoft\Teams directory. The benefit of installing to the user profile rather than the global program folder is that the user does not need administrative privileges to perform the installation.
If the client is installed to the user profile, the Teams directory looks similar to this:
As the names of the files and folders might not be totally obvious, here is my brief description of the functionality and usage of each of the files:
current directory contains the extracted program files of the currently installed copy of Teams.
packages directory contains the NuGet package (basically a compressed copy of the program files with some metadata) that gets installed by the Squirrel framework.
previous directory contains the program files of the previously installed copy of Teams. The Squirrel framework keeps these as a backup to roll back to in case the update process fails. This directory should be empty if Teams has only just been installed for the first time.
app.ico is just an icon for Microsoft Teams.
Ressources.pri appears to be a Package Resource Index file generated by Visual Studio.
setup.json contains a single parameter referencing a Microsoft Teams executable.
SquirrelSetup.log is an append-only log that keeps track of Squirrel’s checks for newer versions, successful installations and updates. This file could be used for timeline analysis to correlate starts of the application with usage activities.
Update.exe is the executable that the Microsoft Teams shortcuts in the start menu and on the desktop link to. This executable is part of the Squirrel framework and is responsible for installing and updating the application components at launch. Ideally, this happens transparently to the user. Once Update.exe has performed its checks, it launches the actual Teams.exe.
Update.VisualElementsManifest.xml is an XML file that contains a few references to logos of the application.
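To illustrate the timeline idea mentioned for SquirrelSetup.log, here is a minimal sketch that turns log lines into timestamped events. The line layout used here (YYYY-MM-DD HH:MM:SS> Component: message) is an assumption for illustration and should be checked against the log at hand; parse_squirrel_log and LogEvent are hypothetical helper names, not part of Squirrel.

```python
import re
from datetime import datetime
from typing import Iterable, Iterator, NamedTuple

# ASSUMPTION: each entry starts with "YYYY-MM-DD HH:MM:SS> Component: message".
# Verify this against the real file before relying on the parser.
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})>\s*([^:]+):\s*(.*)$")


class LogEvent(NamedTuple):
    timestamp: datetime  # naive: the assumed layout carries no timezone
    component: str
    message: str


def parse_squirrel_log(lines: Iterable[str]) -> Iterator[LogEvent]:
    """Yield one LogEvent per line that matches the assumed layout."""
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip lines that do not fit, e.g. wrapped output
        stamp, component, message = match.groups()
        yield LogEvent(
            datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"),
            component.strip(),
            message,
        )
```

The function accepts any iterable of strings, so an open file handle can be passed in directly. Because the timestamps are parsed as naive values, they would still need to be aligned with the system timezone before being merged into a wider timeline.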
AppData Roaming Usage Artefacts
The actual juicy artefacts, the user-generated content, are distributed across various files within the C:\Users\%USERNAME%\AppData\Roaming\Microsoft\Teams directory. The following figure shows the different directories and files that can commonly be found within the directory. Based on my research, however, it is worth noting that the number and type of the artefacts may vary depending on the version, the tenant type and application usage.
Most notable are the following:
Cookies file is an SQLite database that stores session cookies. In the current version, the cookies are still unencrypted.
desktop-config.json is a JSON file that contains configuration settings, such as the account holder’s name, the public IP address and the account’s email addresses.
installTime.txt is a plain-text file that contains the date when Microsoft Teams was first installed on the client.
logs.txt is a continuously written append-only log that stores the communication with the middleware. This file typically contains a large number of timestamps including their timezone information. Based on this file, it is possible to reconstruct application launches, shutdowns and even the resizing of the application window.
storage.json is a JSON file that has an auth_tenant_users_map field with an entry for each user that has logged into Teams. Each entry includes the user’s full name, email address, and the public IP address from which they last logged in. Additionally, it has a property called auth_time, which is a Unix epoch timestamp that indicates when a user first successfully authenticated within Microsoft Teams.
IndexedDB directory contains a LevelDB database that, among other entries, stores the messages, posts, comments, contacts, appointments and call logs.
Session Storage directory contains another LevelDB database, which stores the JWT tokens that are used for authenticating the client against the server.
Local Storage directory contains the third LevelDB database utilised by Microsoft Teams. Among other entries, the local storage database keeps track of the file transfers and also contains message drafts.
Cache directory is home to a Google DiskCache that contains (mostly) cached copies of user-generated artefacts, such as cached copies of the profile pictures but also thumbnails of files that had been exchanged.
Network Persistent State file contains various URLs and network quality indicators. If the client was connected over Wi-Fi, it might also contain the access point’s SSID as a base64-encoded string.
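Since the Cookies file is a plain SQLite database, it can be read with standard tooling. The following sketch assumes the usual Chromium cookie schema (a cookies table with host_key, name, value and creation_utc columns, the latter counted in microseconds since 1601-01-01 UTC); whether a given Teams version matches this schema should be verified first. The function names are hypothetical.

```python
import sqlite3
from datetime import datetime, timedelta, timezone
from pathlib import Path

# Chromium stores cookie times as microseconds since 1601-01-01 UTC.
WEBKIT_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def webkit_to_datetime(microseconds: int) -> datetime:
    """Convert a Chromium/WebKit timestamp to an aware UTC datetime."""
    return WEBKIT_EPOCH + timedelta(microseconds=microseconds)


def read_cookies(db_path: str) -> list:
    """Return (host, name, value, created) tuples from a Cookies database.

    ASSUMPTION: the file follows the common Chromium schema described above.
    """
    # Open read-only through a URI so the evidence file is not modified.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT host_key, name, value, creation_utc "
            "FROM cookies ORDER BY creation_utc"
        ).fetchall()
    finally:
        conn.close()
    return [
        (host, name, value, webkit_to_datetime(created))
        for host, name, value, created in rows
    ]
```

Even with the read-only flag, it is safer to run this against a working copy of the file rather than the original evidence.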
Processing the Forensic Artefacts
There is a considerable number of files and directories to sift through, far more than I could cover in this post. Therefore, I will focus on the most relevant ones, namely the LevelDB database stored in the IndexedDB folder and the Chromium DiskCache located within the Cache folder. A full discussion of all the files of forensic interest can be found in my thesis, which will hopefully be published soon.
The https_teams.microsoft.com_0.indexeddb.leveldb LevelDB database within the
Even if you are not interested in the nitty-gritty details of LevelDB databases, you should know that they use an append-only log that stores the most recent transactions and can grow up to a size of 4 MB. Once the .log file has reached its maximum size, the records get deduplicated and compressed into one or more higher-level .ldb files. This detail is crucial, as this step increases the entropy and makes string searches highly ineffective for the higher-level files.
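The effect of compression on entropy can be measured directly. The sketch below computes the Shannon entropy in bits per byte and compares synthetic plain-text records with a compressed copy. Note the assumptions: the records are made up, and zlib serves only as a standard-library stand-in, as it is not necessarily the codec LevelDB applies, so the numbers for real .ldb files will differ.

```python
import hashlib
import json
import math
import zlib
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(data).values()
    )


def file_entropy(path: str) -> float:
    """Convenience wrapper for measuring a .log or .ldb file on disk."""
    with open(path, "rb") as handle:
        return shannon_entropy(handle.read())


# Synthetic stand-in for uncompressed records (hypothetical field values).
plain = "\n".join(
    json.dumps({"id": i, "renderContent": hashlib.sha256(str(i).encode()).hexdigest()})
    for i in range(2000)
).encode("utf-8")

# Stand-in compression step; see the caveat above.
packed = zlib.compress(plain)

print(f"plain:  {shannon_entropy(plain):.2f} bits/byte")
print(f"packed: {shannon_entropy(packed):.2f} bits/byte")
```

Running file_entropy over the .log file and each .ldb file of an acquired database gives a quick indication of which files are still worth a keyword search.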
Dumping the IndexedDB LevelDB Database
A major hurdle when trying to investigate IndexedDB LevelDB databases is the lack of robust, versatile and publicly available tools that allow navigating the databases and extracting their content. As part of my thesis, I tried various tools, such as FastNoSQL, which only yielded a jumbled mess. Therefore, I developed a few Python scripts based on the excellent ccl_chrome_indexeddb Python library that allowed me to easily process the Microsoft Teams IndexedDB database and access records that were located in both the .log and .ldb files.
The first script, dump_leveldb.py, dumps a LevelDB’s records to a JSON file. The usage is extremely simple: all that has to be specified is the path to the LevelDB database and the path where the JSON file should be written.
The dumped records all follow the same setup. They have:
key, which identifies the record.
origin_file, with the path to the file where the record was found. This could be the .log or one of the .ldb files.
store, which refers to the object store in which the record was located. Object stores can be compared to database tables in a relational database.
value, which contains the actual data of a specific record.
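With these four fields, a dump can be triaged quickly by grouping records per object store and counting how many came from each source file. The sketch below assumes that the dump is a single JSON array of objects carrying the fields listed above; the exact output layout of dump_leveldb.py is not shown in this post, so treat the loader as a starting point. All function names are hypothetical.

```python
import json
from collections import Counter, defaultdict


def load_dump(path: str) -> list:
    """Load a dump file. ASSUMPTION: it holds one JSON array of records."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def group_by_store(records: list) -> dict:
    """Map each object store name to the list of records found in it."""
    stores = defaultdict(list)
    for record in records:
        stores[record.get("store", "<unknown>")].append(record)
    return dict(stores)


def records_per_origin(records: list) -> Counter:
    """Count records per origin file (.log versus the individual .ldb files)."""
    return Counter(record.get("origin_file", "<unknown>") for record in records)
```

Grouping by store first makes it easy to hand only the stores of interest to later processing steps.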
Database Structure and Object Stores
Based on my analysis, Microsoft Teams utilises at least the following databases and object stores. The relevant flag is my own estimation of whether or not an object store is of forensic interest. Please note that the number of databases and object stores may vary depending on the platform that is investigated. Furthermore, the names of the object stores might change in upcoming releases.
Database Records of Interest
Let’s have a look at a couple of dumped database records to understand what data could be recovered. If you want to follow along, you can find a copy of the IndexedDB database on my GitHub and simply use the previously discussed dump_leveldb.py script to dump the database.
A major concern during a forensic investigation will always be to retrace the communication a suspect was involved in. Luckily for us, Microsoft Teams keeps a copy of the exchanged text messages, comments and posts within the replychains object store. A record of a message that has previously been exchanged might look something like this:
As you can see, even a simple text message has quite a few properties. From a forensic perspective, the most interesting properties are the following:
renderContent contains the message body of the text message.
creator holds the user ID of the author of the message.
creatorProfile contains a dictionary with details on the author, including first name, last name and UPN.
conversationId identifies the thread in which a message has been sent.
composetime stores the timestamp when the message was originally authored.
isFromMe flag indicates the direction of a message, i.e. whether it was outgoing or incoming.
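The properties above are enough to reduce a message to a compact row for review. The following sketch takes a single message dictionary and extracts them. Two assumptions are worth stating: composetime is treated as an ISO 8601 string (anything else yields None), and the function expects the message dictionary itself rather than the surrounding database record. The names summarise_message and parse_composetime are hypothetical.

```python
from datetime import datetime
from typing import Optional


def parse_composetime(raw) -> Optional[datetime]:
    """Parse composetime. ASSUMPTION: an ISO 8601 string, optionally ending in Z."""
    if not isinstance(raw, str):
        return None
    try:
        # Older Python versions do not accept a trailing "Z" in fromisoformat.
        return datetime.fromisoformat(raw.replace("Z", "+00:00"))
    except ValueError:
        return None


def summarise_message(message: dict) -> dict:
    """Reduce a message dictionary to the forensically interesting fields."""
    flag = message.get("isFromMe")
    if flag is None:
        direction = "unknown"
    else:
        direction = "outgoing" if flag else "incoming"
    return {
        "when": parse_composetime(message.get("composetime")),
        "direction": direction,
        "author_id": message.get("creator"),
        "conversation": message.get("conversationId"),
        "body": message.get("renderContent"),
    }
```

Missing properties are mapped to None or "unknown" rather than raising, since records recovered from the database may be incomplete.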
Files that were sent between users can be tracked based on the dictionary under attachments. The following figure, for example, shows the record of a file called bagpipes.mp4, which had been exchanged between the two users.
Again, a couple of properties are especially important.
objectUrl contains the remote server address of the resource. It’s worth noting, though, that files are typically not publicly accessible and valid user credentials with appropriate permissions are required to access the file stored on SharePoint.
title stores the filename of the file that has been transferred.
type refers to the file type of the exchanged file.
Similarly to messages, contacts can also be found within the LevelDB database. Unlike messages, these are stored in a different object store called people. Their structure looks like this:
Of interest might be the following properties:
mri is the user ID of a specific contact. It can be used to map a contact to a text message.
userPrincipalName contains the UPN of the contact.
givenName holds the full name of the contact, including both first and last name.
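Mapping a contact to a text message can be sketched as a simple lookup. The example assumes that the creator value of a message equals the mri of the corresponding contact, and that the contact dictionaries carry the three properties listed above. The helper names and the sample identifiers in the usage note are hypothetical.

```python
def build_contact_index(people: list) -> dict:
    """Index contact dictionaries by their mri, skipping entries without one."""
    return {person["mri"]: person for person in people if "mri" in person}


def resolve_author(message: dict, contacts: dict) -> str:
    """Return a readable author name for a message.

    ASSUMPTION: the message's creator value matches a contact's mri.
    Falls back to the UPN, then to the raw creator ID, if no name is known.
    """
    creator = message.get("creator")
    contact = contacts.get(creator)
    if contact is None:
        return creator
    return contact.get("givenName") or contact.get("userPrincipalName") or creator
```

For instance, with a contact whose mri is "8:orgid:alice" and whose givenName is "Alice Example", a message with creator "8:orgid:alice" resolves to that name, while an unknown creator is returned unchanged.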
Text messages are good, but what about call logs, you may ask? Luckily for us, Microsoft Teams also keeps a record of these and logs them precisely within the replychains object store. A single record of an accepted, outgoing call looks like this:
All the interesting call log data, such as the start and end time of the call, the call direction and the call type, are conveniently stored as a JSON array accessible through the dictionary key
Automated Parsing of the IndexedDB LevelDB Database within Autopsy Forensic Suite
If you don’t like fiddling with command-line tools and manually browsing through thousands of lines of JSON, then I’ve got good news for you. As part of my thesis, I’ve also been working on an Autopsy ingest module that automatically extracts all the relevant records and turns these into Blackboard artefacts. More information on the Autopsy parser can be found on my project’s website at forensics.im or in the demo video on YouTube.
Microsoft Teams uses DiskCache data structures for caching various content on the client side, such as application files and icons but also user-generated content. The user-generated items were all located within the Cache folder and could be inspected using ChromeCacheView, which allowed convenient access to the records distributed across the cache files.
Among the files that got persisted in the cache are the profile pictures of the account holders, thumbnails for links with Twitter cards functionality and the previews of images that were exchanged. One aspect that stood out is that the thumbnails of the exchanged images all had the same file name, 1.jfif, which makes them fairly easy to identify. The following figure shows the thumbnail of an image that had been exchanged as part of a conversation that occurred as a direct message.
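The .jfif name suggests JFIF-encoded JPEG images. As a rough cross-check that is independent of ChromeCacheView, the sketch below scans raw bytes for the JFIF header (a JPEG start-of-image marker followed by an APP0 segment with the identifier "JFIF"). It assumes that the cached image bodies are stored uncompressed in the cache files, which should be confirmed on real data; find_jfif_offsets is a hypothetical helper name.

```python
# JPEG start-of-image marker (FF D8) followed by the APP0 marker (FF E0).
SOI_APP0 = b"\xff\xd8\xff\xe0"
JFIF_ID = b"JFIF\x00"


def find_jfif_offsets(data: bytes) -> list:
    """Return the byte offsets at which a JFIF header starts within data."""
    offsets = []
    position = data.find(SOI_APP0)
    while position != -1:
        # The APP0 segment holds a 2-byte length and then the identifier,
        # so "JFIF\0" is expected 6 bytes after the start of the header.
        if data[position + 6:position + 11] == JFIF_ID:
            offsets.append(position)
        position = data.find(SOI_APP0, position + 1)
    return offsets
```

Offsets found this way only mark candidate image starts. Extracting a complete image would additionally require locating its end, and the cache metadata is still needed to tie an image to the URL it was served from.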
I thoroughly enjoyed my forensic investigation of Microsoft Teams and was quite astonished at just how much potential evidence could be recovered from Microsoft Teams’ Desktop Client. Even though this post could only scratch the surface of what I covered in my thesis, I still hope it provides enough information to get your own investigation started. In the future, I am also planning to inspect the files of the other Microsoft Teams desktop clients, such as the ones for macOS and Linux. Once I get around to doing that, I will update the post accordingly.
[1] Microsoft Teams can also be installed machine-wide if installed through the MSI file. In this case, the install location would be C:\Program Files (x86)\Teams Installer or C:\Program Files\Teams Installer, depending on the architecture. More information on this deployment method can be found here: https://docs.microsoft.com/en-us/microsoftteams/msi-deployment
[2] Caithness, A. (2020, September 23). Hang on! That’s not SQLite! Chrome, Electron and LevelDB. https://www.cclsolutionsgroup.com//post/hang-on-thats-not-sqlite-chrome-electron-and-leveldb