Monitoring Raw X11 Communication or why Chromium opens 7 Xorg connections
This is partly a continuation of a previous article on this topic: Exploring Xorg connections
While exploring Xorg and its connections I stumbled upon an interesting fact: some apps open multiple connections to Xorg even though they have just one window. For example, Chromium had 7, Skype 8, and Firefox and Telegram 2 each. Since Skype’s connection count was suspiciously close to Chromium’s, I immediately assumed it was all because of Electron, which is based on Chromium. I then opened VS Code and indeed it had 8 connections, which led to the assumption that all Chromium-based (Electron) applications open 7 or more Xorg connections.
This got me interested in finding out what those 7 connections were. Since Chromium had 7 connections and the other Chromium-based apps had 8, I thought that Chromium opens 7 and each of the others opens one extra for itself. This led me to look into Chromium, as it was the common factor between all of them.
Initially it looked like Chromium used one connection per window or per tab. But later experiments showed that no matter how many Chromium windows or tabs are open, the number of connections stays at exactly 7. This led to a different assumption: Chromium might split its connections by functionality. But how to verify it? I started by grepping the Chromium source code, as I thought it was the easiest way, but after 30-50 minutes it looked like figuring out the source code would take more time than reverse engineering the behavior at the protocol level.
To my big surprise, unix sockets are supported neither by tcpdump nor by Wireshark. This totally broke my expectation that I could just plug in some ready-made tool, enter my root password and have full access to the information. After some time of googling and then “talking” to various chatbots I still hadn’t found an easy way. But then I remembered that X11 follows a client-server model and must support TCP (something that I am more comfortable with), and it still does.
DISCLAIMER: If you decide to turn on TCP communication on your Xorg server, I do not take any responsibility, and I strongly discourage doing so unless you know what you are doing.
To force the X server to work over TCP we need to change how it starts. On my system the file responsible for starting X is /etc/X11/xinit/xserverrc. In that file you need to ask the X server not to listen on unix and local connections. And since TCP connections are probably disabled by default, you need to ask X to enable them (without that it won’t start). So for myself I put this line in my xserverrc:
exec /usr/bin/X -nolisten unix -nolisten local -listen tcp "$@"
To make it work, just reboot the system or kill the X server. Just don’t forget to save all your data before doing so.
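A quick way to check that the X server really accepts TCP connections after the restart is to try connecting to its port. Below is a minimal Python sketch of that check. It assumes display :0, which conventionally maps to TCP port 6000 (display :N maps to 6000 + N); the function names are only illustrative.

```python
import socket


def x11_tcp_port(display_number=0):
    """Return the conventional TCP port of X display :N (6000 + N)."""
    return 6000 + display_number


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # The connection is closed right away; nothing is sent to the server.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timeout, unreachable host, and so on.
        return False
```

Calling `is_listening("127.0.0.1", x11_tcp_port(0))` should return True once the server listens on TCP, and False when it only listens on unix sockets.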
After the X server starts again you can list all connections and grep for “x11” or “6000”. In my current configuration ss shows port 6000 as x11 when it lists all connections. This is how it might look:
If you didn’t read the previous post, here is how Xorg connections normally look (on my computer they are always abstract unix sockets):
So now that we have forced everybody to work over TCP, we can use Wireshark to monitor the communication between clients and server. Just start Wireshark and ask it to monitor everything on the Loopback interface. This will create a lot of logs, so my advice is to record just enough interaction with the system, save it to a local file and work with the “offline” version. I did the same and recorded just over 1 minute of interaction with Chromium: I first started monitoring, then opened Chromium and did several operations like opening and closing windows, opening and closing tabs and copy-pasting, and then closed Chromium altogether. This is how it looked:
In the end, 72 seconds of Wireshark session ended up as one file of 740 MB of pcap data consisting of 120890 packets. This is a big bowl of data points which needs to be filtered to be of any use. First I filtered out anything not related to the X11 protocol by entering x11 in the display filter bar. This alone reduced the sample to 78960 packets, removing about 35% of the packets: TCP overhead and other unrelated data.
The next step is to filter by each connection and see what happened in each one. The trick is similar to what was done in the previous post: filter all connections that have x11 or port 6000 as the destination port. It is a lot simpler than the previous script (but I was lazy, so I reused the old script and changed some params there, even though it is less efficient). This is what I got when I ran the script on my machine.
16: | 37008 - chromium | pid:203388 -1 - chromium
17: | 37002 - chromium | pid:203388 -1 - chromium
18: | 37050 - chromium | pid:203423 -1 - chromium
19: | 37022 - chromium | pid:203423 -1 - chromium
20: | 37030 - chromium | pid:203423 -1 - chromium
21: | 37062 - chromium | pid:203423 -1 - chromium
22: | 37036 - chromium | pid:203423 -1 - chromium
The first column is just an index in the list of results. The second column is the source port and the pid field is the process ID. So the started Chromium window has 2 processes, one of which has two connections with Xorg and the other five. To explore the communication of each connection we just need to filter by the x11 protocol and a specific port like this:
x11 && tcp.port == 37008
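The grouping by process and the per-port filters can also be sketched in a few lines of Python. This is not the script used for the listing above, only an illustration under some assumptions: the input is text in the layout printed by `ss -tnp` (numeric ports, with process info), and the X server listens on port 6000. The sample lines are made up in that layout from the ports and PIDs above; the fd numbers and the Xorg PID are invented.

```python
import re
from collections import defaultdict

X11_TCP_PORT = 6000

# Matches "<local>:<port> <peer>:<port> users:(("name",pid=N,..." in one line.
LINE_RE = re.compile(
    r'(?P<local>\S+):(?P<lport>\d+)\s+'
    r'(?P<peer>\S+):(?P<pport>\d+)\s+'
    r'users:\(\("(?P<name>[^"]+)",pid=(?P<pid>\d+),'
)


def clients_by_pid(ss_output, server_port=X11_TCP_PORT):
    """Map (pid, process name) to the local ports connected to the X server."""
    groups = defaultdict(list)
    for line in ss_output.splitlines():
        match = LINE_RE.search(line)
        # Keep only the client side: sockets whose peer is the X server port.
        if match and int(match["pport"]) == server_port:
            key = (int(match["pid"]), match["name"])
            groups[key].append(int(match["lport"]))
    return dict(groups)


def display_filter(port):
    """Build the Wireshark display filter for one client connection."""
    return f"x11 && tcp.port == {port}"


# Illustrative input only (fd numbers and the Xorg pid are invented).
SAMPLE = """\
ESTAB 0 0 127.0.0.1:37008 127.0.0.1:6000 users:(("chromium",pid=203388,fd=20))
ESTAB 0 0 127.0.0.1:37002 127.0.0.1:6000 users:(("chromium",pid=203388,fd=21))
ESTAB 0 0 127.0.0.1:37050 127.0.0.1:6000 users:(("chromium",pid=203423,fd=22))
ESTAB 0 0 127.0.0.1:6000 127.0.0.1:37008 users:(("Xorg",pid=1000,fd=40))
"""

if __name__ == "__main__":
    for (pid, name), ports in clients_by_pid(SAMPLE).items():
        for port in ports:
            print(pid, name, display_filter(port))
```

The server-side line (local port 6000) is skipped on purpose, so every connection is counted once, from the client side.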
In total my interactions with Chromium created 65747 packets, which is slightly more than half of all the packets captured. I didn’t measure it, but I guess the second largest traffic user could be Wireshark itself. If we filter just Chromium, in 72 seconds it gives us about 7.4 MiB of X11-only traffic, which is on average about 105 KiB/s of X11 traffic.
port 37008 | 341 packets | 79754 bytes
port 37002 | 19145 packets | 3459550 bytes
port 37050 | 45670 packets | 4130920 bytes
port 37022 | 14 packets | 15920 bytes
port 37030 | 252 packets | 62120 bytes
port 37062 | 14 packets | 15920 bytes
port 37036 | 38 packets | 23908 bytes
sum: | 65747 packets | 7788092 bytes, about 7.4 MiB
From the stats above we can see that one port in each of the 2 processes does the heavy lifting. It could be that port 37002 is doing input processing and port 37050 is responsible for all drawing operations.
In one of my interactive sessions I ran Wireshark and monitored each port for activity while I did something with the window. These are my notes on how the packets changed:
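Per-port numbers like the ones in the table can be computed offline from the saved capture. The following is only a sketch of one way to do it, not how the table above was produced. It assumes the capture was exported with tshark as three tab-separated fields per packet (source port, destination port, frame length) and that summing frame.len is an acceptable measure of bytes. The file name in the comment is hypothetical.

```python
from collections import defaultdict

X11_TCP_PORT = 6000

# Assumed export command (capture.pcapng is a placeholder name):
#   tshark -r capture.pcapng -Y x11 -T fields \
#          -e tcp.srcport -e tcp.dstport -e frame.len


def per_client_stats(tshark_fields, server_port=X11_TCP_PORT):
    """Return {client_port: (packets, bytes)}, counting both directions."""
    stats = defaultdict(lambda: [0, 0])
    for line in tshark_fields.splitlines():
        parts = line.split("\t")
        if len(parts) != 3 or not all(p.isdigit() for p in parts):
            continue  # skip empty or malformed lines
        src, dst, length = map(int, parts)
        if src == server_port:
            client = dst
        elif dst == server_port:
            client = src
        else:
            continue  # not X11-over-TCP traffic on the expected port
        stats[client][0] += 1
        stats[client][1] += length
    return {port: (count, size) for port, (count, size) in stats.items()}


def totals(stats, seconds):
    """Return (packets, bytes, MiB, KiB per second) over all client ports."""
    packets = sum(count for count, _ in stats.values())
    size = sum(nbytes for _, nbytes in stats.values())
    return packets, size, size / 2**20, size / 1024 / seconds
```

Fed with the byte column of the table above and a duration of 72 seconds, `totals` reports 7788092 bytes, which is roughly 7.4 MiB or about 105 KiB/s.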
- port 1: Nothing - Map notify and Atoms setup
- port 2: Keyboard input. More packets on input
- port 3: Window attribute changes
- port 4: Nothing
- port 5: Constantly sends something when window is visible and nothing when invisible.
- port 6: Nothing
- port 7: Nothing
In the capture above, port 37008 just sends InternAtom requests, queries some extensions and receives some notifications such as PropertyNotify and xkb-MapNotify. Here is the list of requested Atoms:
Port 37002 queries extensions, sets up some Atoms and then mostly shows traffic such as GenericEvent, PropertyNotify, QueryPointer and others. My best attempt at restoring the Atom list from the byte stream is here:
Abs Dbl End Timestamp
Abs Dbl Fling X Velocity
Abs Dbl Fling Y Velocity
Abs Dbl Metrics Data 1
Abs Dbl Metrics Data 2
Abs Dbl Ordinal X
Abs Dbl Ordinal Y
Abs Dbl Start Timestamp
Abs Finger Count
Abs Fling State
Abs MT Orientation
Abs MT Position X
Abs MT Position Y
Abs MT Pressure
Abs MT Touch Major
Abs MT Touch Minor
Abs MT Tracking ID
Abs Metrics Type
Device Product ID
Rel Horiz Wheel
Rel Vert Wheel
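Restoring such a list by eye is error-prone, because the names are not separated by newlines in the byte stream. Each InternAtom request carries the length of its name, so the names can be cut out mechanically. Here is a minimal sketch of that idea. It assumes a little-endian client, a byte string that starts exactly at a request boundary (so the connection setup has already been stripped), and no BIG-REQUESTS extended lengths. The helper that builds requests exists only to produce test input.

```python
import struct

INTERN_ATOM_OPCODE = 16  # core X11 protocol opcode of InternAtom


def build_intern_atom(name, only_if_exists=False):
    """Encode one little-endian InternAtom request for the given name (bytes)."""
    pad = (-len(name)) % 4  # requests are padded to a multiple of 4 bytes
    length_units = 2 + (len(name) + pad) // 4  # total length in 4-byte units
    header = struct.pack("<BBHHH", INTERN_ATOM_OPCODE, int(only_if_exists),
                         length_units, len(name), 0)
    return header + name + b"\0" * pad


def parse_intern_atoms(stream):
    """Return the atom names of all InternAtom requests found in stream."""
    names = []
    offset = 0
    while offset + 4 <= len(stream):
        opcode, _, length_units = struct.unpack_from("<BBH", stream, offset)
        end = offset + length_units * 4
        if length_units == 0 or end > len(stream):
            break  # extended-length or truncated request: not handled here
        if opcode == INTERN_ATOM_OPCODE and length_units >= 2:
            (name_len,) = struct.unpack_from("<H", stream, offset + 4)
            raw = stream[offset + 8:offset + 8 + name_len]
            names.append(raw.decode("latin-1"))
        offset = end  # jump to the next request, whatever its opcode
    return names
```

Because the parser advances by the length field of every request, two consecutive names such as Abs Dbl Ordinal X and Abs Dbl Ordinal Y come out as separate entries instead of one fused string.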
Port 37050 is the one related to drawing. It queried some extensions and then mostly worked with GLX and DRI2, issued requests such as GetGeometry, XFree86-VidModeExtension and GetInputFocus, and received events such as dri2-InvalidateBuffers.
Port 37022 didn’t do anything of value. From the log it is visible that it was initialized at the beginning and did nothing for the rest of the interaction. It is so small that I could list all of its interactions in one image:
Port 37030 was a bit more involved than the previous one. For some reason it started by requesting the same Atoms as port 37002. Maybe port 37002 from the first process set up the Atoms and filled them with data for the second process to read? Apart from this it just issued some GetProperty and QueryTree requests.
Port 37062 has about as few actions as port 37022: it did a setup at the beginning and nothing else.
Port 37036 is also a “passive” connection which queries for some extensions and then does nothing. It looks like it queries extensions that were already queried by previous connections.
From what I have seen so far, I do not see anything that would require having separate connections. It might be some optimization technique which I don’t know about. It would be good if someone had written down the reasoning behind this structure. (I searched online, but either I am bad at searching or it is just not there yet.)
What about Firefox?
I did the same but shorter exploration with Firefox. It uses one process and two connections to Xorg. The first connection seems to be related to input and everything related to window management, and the second is used for drawing only. On the connection related to input it sets up Atoms related to window management like WM_DELETE_WINDOW, WM_LOCALE_NAME, WM_PROTOCOLS, etc. Apart from this there was nothing out of the ordinary.
How many connections other applications open
I didn’t do deep exploration for other applications but here is some short information about apps that I use almost every day.
vlc - 2 connections
gpicview - 1 connection
skype - 8 connections spread between 2 processes (3 + 5)
telegram-desktop - 2 connections
inkscape - 1 connection
gimp - 1 connection
lowriter (soffice.bin) - 1 connection
localc (soffice.bin) - 1 connection
st (terminal) - 1 connection
android-studio (java) - 2 connections 1 process
vscode - 8 connections, 2 processes (3 connections for 'code-oss', 5 connections for 'electron' named processes)
For Chromium and Firefox the number of open windows and tabs does not affect the number of connections; it always remains static. Simpler apps, for example st (terminal), open a new process and a new connection for each of their windows.
I got involved with this crazy but super fun idea by accidentally noticing that closing Skype closed a bunch of Xorg connections, which made me interested in finding out why Skype needs so many connections in the first place. This led to the understanding that it was Chromium that was responsible for opening all these connections and Skype was just using it.
I looked at how to force Xorg to allow only TCP connections so that I could look at its traffic through Wireshark (don’t forget to turn it off afterwards), and then analyzed what Chromium was doing with all those connections. From a very superficial exploration, without digging into details or looking into Chromium’s source code, it looks like it is just some kind of optimization technique. There is one connection for drawing and one for input.
This post does not reveal anything new. Chromium’s source code is open and you can find this out by reading the code. But this is just a different way to look at it, and it was a fun way to waste my time. I hope you had fun reading it and didn’t waste yours.