Monitoring Raw X11 Communication or why Chromium opens 7 Xorg connections
This is partly a continuation of the previous article on this topic: Exploring Xorg connections
While exploring Xorg and its connections I stumbled upon the interesting fact that some apps open multiple connections to Xorg even though they have just one window. For example Chromium had 7, Skype 8, and Firefox and Telegram had 2 connections each. Since the Skype connection count was suspiciously close to Chromium's, I immediately assumed that it is all because of Electron, since it is based on Chromium. I then opened vscode and indeed it had 8 connections, which leads to the assumption that all Chromium-based (Electron) applications open 7+ Xorg connections.
This got me interested in getting to know what those 7 connections were. Since Chromium had 7 connections and the rest of the Chromium-based apps had 8, I thought that Chromium opens 7 connections and all the others just open one extra for themselves. This led me to look into Chromium, as it was the common factor between all of them.
Initially it looked like Chromium used one connection either for each window or for each tab. But later experiments showed that no matter how many Chromium windows or tabs are open, the number of connections remains exactly 7. This leads to a different assumption: Chromium might split connections by functionality. But how to verify it? I first started with grepping the Chromium source code, as I thought it was the easiest way, but after 30-50 minutes it looked like the source code would take more time to figure out than just reverse engineering it at the protocol communication level.
To my big surprise, unix sockets are supported neither by tcpdump nor by Wireshark. This totally broke my feeling that I could just plug in some ready-made tool, enter my root password and have full access to the information. After some time of googling and then “talking” to various chatbots I still hadn’t found an easy way. But then I remembered that X11 is a client-server model that must support TCP (something that I am more comfortable with), and it still does.
DISCLAIMER: If you decide to turn on TCP communication on your Xorg server, I do not take any responsibility, and I fully discourage doing so unless you know what you are doing.
To force the X server to work through TCP we need to change how it starts. On my system the file responsible for starting X is located at /etc/X11/xinit/xserverrc. In that file you need to ask the X server not to listen on unix and local connections. And since TCP connections are probably disabled by default, you need to ask X to enable them (without that it won’t start). So for myself I put this line in my xserverrc:
exec /usr/bin/X -nolisten unix -nolisten local -listen tcp "$@"
To make it work just reboot the system or kill the X server; just don’t forget to save all your data before doing so.
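Once the server is back up, a quick way to confirm that it really accepts TCP is to try connecting to it. X11 over TCP conventionally uses port 6000 plus the display number, which is why display :0 ends up on port 6000. The following is a minimal sketch in Python under that assumption; x11_tcp_port and is_listening are hypothetical helper names, and the code expects a display string that contains a colon.

```python
import socket

X11_TCP_BASE_PORT = 6000  # X11 over TCP uses 6000 + display number


def x11_tcp_port(display: str) -> int:
    """Map a display string such as ':0' or 'localhost:1.0' to its TCP port."""
    after_colon = display.rsplit(":", 1)[1]           # '0' or '1.0'
    display_number = int(after_colon.split(".", 1)[0])  # drop the screen part
    return X11_TCP_BASE_PORT + display_number


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    port = x11_tcp_port(":0")  # assumption: the server runs as display :0
    print(f"port {port} accepting connections: {is_listening('127.0.0.1', port)}")
```

This only checks that something accepts connections on that port; it does not speak the X11 protocol or verify that the listener is Xorg.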
After the X server starts again you can list all connections and grep for “x11” or “6000”. In my current configuration ss shows port 6000 as x11 when it lists all connections. This is how it might look:
If you didn’t read the previous post, here is how Xorg connections normally look (on my computer they are always abstract unix sockets):
So now that we have forced everybody to work over TCP, we can use Wireshark to monitor the communication between clients and server. Just boot up Wireshark and ask it to monitor everything on the Loopback interface. This will create a lot of logs, so my advice is to log just enough interactions with the system, save them to a local file and work with the “offline” version. I did the same and recorded just over 1 minute of interaction with Chromium: I first started monitoring, then opened Chromium, did several operations like opening and closing windows, opening and closing tabs and copy-pasting, and then closed Chromium altogether. This is how it looked:
At the end, 72 seconds of Wireshark session ended up as one 740 MB pcap file consisting of 120890 packets. This is a big bowl of data points which needs to be filtered to be of any use. First I filtered out anything that is not related to the X11 protocol by typing x11 into the display filter bar. This alone reduced the sample to 78960 packets and removed about 35% of the capture: TCP overhead and other unrelated data.
The next step is to filter by each connection and see what happened in each one. The trick is similar to what was done in the previous post: find all connections that have x11 or port 6000 as the destination port. It is a lot simpler than the previous script (but I was lazy, so I took the old script and changed some params there even though it is less efficient). This is what I got when I ran the script on my machine.
16: | 37008 - chromium | pid:203388 -1 - chromium
17: | 37002 - chromium | pid:203388 -1 - chromium
18: | 37050 - chromium | pid:203423 -1 - chromium
19: | 37022 - chromium | pid:203423 -1 - chromium
20: | 37030 - chromium | pid:203423 -1 - chromium
21: | 37062 - chromium | pid:203423 -1 - chromium
22: | 37036 - chromium | pid:203423 -1 - chromium
The first column is just an index in the list of results. The second column is the source port, and the pid: field is the process ID. So the started Chromium window has 2 processes, one of which has two connections and the other five connections to Xorg. To explore the communication of each connection we just need to filter by the x11 protocol and a specific port, like this:
x11 && tcp.port == 37008
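To avoid typing the filter by hand for every port, the listing can be grouped by process and turned into one display filter per connection. This is a small sketch in Python, not the script I actually used; it assumes the exact line format printed above, and ports_by_pid and display_filter are hypothetical helper names.

```python
import re
from collections import defaultdict

# The listing printed by the script, copied verbatim from above.
LISTING = """\
16: | 37008 - chromium | pid:203388 -1 - chromium
17: | 37002 - chromium | pid:203388 -1 - chromium
18: | 37050 - chromium | pid:203423 -1 - chromium
19: | 37022 - chromium | pid:203423 -1 - chromium
20: | 37030 - chromium | pid:203423 -1 - chromium
21: | 37062 - chromium | pid:203423 -1 - chromium
22: | 37036 - chromium | pid:203423 -1 - chromium
"""

# index: | source port - name | pid:<process id> ...
LINE_RE = re.compile(r"^\s*\d+:\s*\|\s*(?P<port>\d+)\s*-\s*\S+\s*\|\s*pid:(?P<pid>\d+)")


def ports_by_pid(listing: str) -> dict:
    """Group the source ports of the listing by the process that owns them."""
    groups = defaultdict(list)
    for line in listing.splitlines():
        match = LINE_RE.match(line)
        if match:
            groups[int(match["pid"])].append(int(match["port"]))
    return dict(groups)


def display_filter(port: int) -> str:
    """Build the Wireshark display filter for one X11 connection."""
    return f"x11 && tcp.port == {port}"


if __name__ == "__main__":
    for pid, ports in ports_by_pid(LISTING).items():
        print(f"pid {pid}: {len(ports)} connections")
        for port in ports:
            print("   ", display_filter(port))
```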
In total my interactions with Chromium created 65747 packets, which is slightly more than half of all the packets captured (and roughly 83% of the X11 packets). I didn’t measure it, but I guess the second largest traffic user could be Wireshark itself. If we filter just Chromium, 72 seconds gives us almost 7.5 megabytes of X11-only traffic, which is on average about 105 KiB/s of X11 traffic.
port 37008 | 341 packets | 79754 bytes
port 37002 | 19145 packets | 3459550 bytes
port 37050 | 45670 packets | 4130920 bytes
port 37022 | 14 packets | 15920 bytes
port 37030 | 252 packets | 62120 bytes
port 37062 | 14 packets | 15920 bytes
port 37036 | 38 packets | 23908 bytes
-----------------------------------------------------
sum: | 65747 packets | 7788092 bytes, 7.42 MiB
From the stats above we can see that there is some heavy lifting on one of the ports of each of the 2 processes. It could be that port 37002 is doing input processing and port 37050 is responsible for all drawing operations.
In one of my interactive sessions I ran Wireshark and monitored each port for activity while I did something with the window. These are my notes on the packet changes.
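The arithmetic behind the byte figures can be checked with a few lines of Python. The byte counts are copied from the table and the 72 seconds is the capture length mentioned earlier; the function names are only for illustration.

```python
CAPTURE_SECONDS = 72  # length of the Wireshark session

# Bytes of X11 traffic per source port, copied from the table above.
BYTES_PER_PORT = {
    37008: 79754,
    37002: 3459550,
    37050: 4130920,
    37022: 15920,
    37030: 62120,
    37062: 15920,
    37036: 23908,
}


def total_bytes(per_port: dict) -> int:
    """Sum the traffic of all connections."""
    return sum(per_port.values())


def average_kib_per_second(per_port: dict, seconds: float) -> float:
    """Average throughput over the capture, in KiB/s."""
    return total_bytes(per_port) / seconds / 1024


def share_of_busiest(per_port: dict, count: int = 2) -> float:
    """Fraction of all bytes carried by the `count` busiest connections."""
    busiest = sorted(per_port.values(), reverse=True)[:count]
    return sum(busiest) / total_bytes(per_port)


if __name__ == "__main__":
    print("total bytes:", total_bytes(BYTES_PER_PORT))
    print("average KiB/s:", round(average_kib_per_second(BYTES_PER_PORT, CAPTURE_SECONDS), 1))
    print("share of two busiest ports:", round(share_of_busiest(BYTES_PER_PORT), 3))
```

The two busiest ports, 37002 and 37050, carry over 97% of the bytes, which is what makes the other five connections look idle in comparison.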
- port 1: Nothing - Map notify and Atoms setup
- port 2: Keyboard input. More packets on input
- port 3: Window attribute changes
- port 4: Nothing
- port 5: Constantly sends something when window is visible and nothing when invisible.
- port 6: Nothing
- port 7: Nothing
In the capture above, port 37008 just sends InternAtom requests, queries for some extensions and receives some notifications such as PropertyNotify and xkb-MapNotify. Here is the list of requested Atoms:
UTF8_STRING
WM_CLIENT_LEADER
WM_DELETE_WINDOW
WM_LOCALE_NAME
WM_PROTOCOLS
WM_TAKE_FOCUS
WM_WINDOW_ROLE
_NET_ACTIVE_WINDOW
_NET_CURRENT_DESKTOP
_NET_FRAME_EXTENTS
_NET_STARTUP_ID
_NET_WM_CM_S0
_NET_WM_DESKTOP
_NET_WM_ICON
_NET_WM_ICON_NAME
_NET_WM_NAME
_NET_WM_PID
_NET_WM_PING
_NET_WM_STATE
_NET_WM_STATE_ABOVE
_NET_WM_STATE_BELOW
_NET_WM_STATE_FULLSCREEN
_NET_WM_STATE_HIDDEN
_NET_WM_STATE_MODAL
_NET_WM_STATE_MAXIMIZED_VERT
_NET_WM_STATE_MAXIMIZED_HORZ
_NET_WM_STATE_SKIP_TASKBAR
_NET_WM_STATE_SKIP_PAGER
_NET_WM_STATE_STICKY
_NET_WM_SYNC_REQUEST
_NET_WM_SYNC_REQUEST_COUNTER
_NET_WM_WINDOW_TYPE
_NET_WM_WINDOW_TYPE_COMBO
_NET_WM_WINDOW_TYPE_DIALOG
_NET_WM_WINDOW_TYPE_DND
_NET_WM_WINDOW_TYPE_DROPDOWN_MENU
_NET_WM_WINDOW_TYPE_MENU
_NET_WM_WINDOW_TYPE_NORMAL
_NET_WM_WINDOW_TYPE_POPUP_MENU
_NET_WM_WINDOW_TYPE_TOOLTIP
_NET_WM_WINDOW_TYPE_UTILITY
_NET_WM_USER_TIME
_NET_WM_USER_TIME_WINDOW
_NET_VIRTUAL_ROOTS
GDK_SELECTION
_NET_WM_STATE_FOCUSED
GDK_VISUALS
Port 37002, after querying for extensions and setting up some Atoms, mostly deals with GenericEvent, PropertyNotify, QueryPointer and others. My best attempt at restoring the Atom list from the byte stream is here:
ATOM_PAIR
Abs Dbl End Timestamp
Abs Dbl Fling X Velocity
Abs Dbl Fling Y Velocity
Abs Dbl Metrics Data 1
Abs Dbl Metrics Data 2
Abs Dbl Ordinal X
Abs Dbl Ordinal Y
Abs Dbl Start Timestamp
Abs Finger Count
Abs Fling State
Abs MT Orientation
Abs MT Position X
Abs MT Position Y
Abs MT Pressure
Abs MT Touch Major
Abs MT Touch Minor
Abs MT Tracking ID
Abs Metrics Type
CHECK
CHOME_SELECTION
CHROME_SELECTION
CHROMIUM_COMPOSITE_WINDOW
CHROMIUM_TIMESTAMP
CLIPBOARD
CLIPBOARD_MANAGER
Content Protection
Desired
Device Node
Device Product ID
EDID
Enabled
FAKE_SELECTION
Full aspect
_GTK_FRAME_EXTENTS
INCR
KEYBOARD
LOCK
MOUSE
MULTIPLE
Rel Horiz Wheel
Rel Vert Wheel
SAVE_TARGETS
SELECTION_STRING
TARGET1
TARGET2
TARGETS
TEXT
TIMESTAMP
TOUCHPAD
TOUCHSCREEN
Tap Paused
Touch Timestamp
UTF8_STRING
Undesired
WM_DELETE_WINDOW
WM_PROTOCOLS
WM_WINDOW_ROLE
XdndActionAsk
XdndActionCopy
XdndActionDirectSave
XdndActionLink
XdndActionList
XdndActionMove
XdndActionPrivate
XdndAware
XdndDirectSave0
XdndDrop
XdndEnter
XdndFinished
XdndLeave
XdndPosition
XdndProxy
XdndSelection
XdndStatus
XdndTypeList
_CHROME_DISPLAY_INTERNAL
_CHROME_DISPLAY_ROTATION
_CHROME_DISPLAY_SCALE_FACTOR
_CHROMIUM_DRAG_RECEIVER
_GTK_HIDE_TITLEBAR_WHEN_MAXIMIZED
_GTK_THEME_VARIANT
_ICC_PROFILE
_MOTIF_WM_HINTS
_NETSCAPE_URL
_NET_ACTIVE_WINDOW
_NET_CURRENT_DESKTOP
_NET_FRAME_EXTENTS
_NET_STARTUP_INFO
_NET_STARTUP_INFO_BEGIN
_NET_SUPPORTED
_NET_SUPPORTING_WM_CHECK
_NET_SYSTEM_TRAY_OPCODE
_NET_SYSTEM_TRAY_S0
_NET_SYSTEM_TRAY_VISUAL
_NET_WM_BYPASS_COMPOSITOR
_NET_WM_CM_S0
_NET_WM_DESKTOP
_NET_WM_ICON
_NET_WM_MOVERESIZE
_NET_WM_NAME
_NET_WM_OPAQUE_REGION
_NET_WM_PID
_NET_WM_PING
_NET_WM_STATE
_NET_WM_STATE_ABOVE
_NET_WM_STATE_FOCUSED
_NET_WM_STATE_FULLSCREEN
_NET_WM_STATE_HIDDEN
_NET_WM_STATE_MAXIMIZED_HORZ
_NET_WM_STATE_MAXIMIZED_VERT
_NET_WM_STATE_SKIP_TASKBAR
_NET_WM_STATE_STICKY
_NET_WM_SYNC_REQUEST
_NET_WM_SYNC_REQUEST_COUNTER
_NET_WM_USER_TIME
_NET_WM_WINDOW_OPACITY
_NET_WM_WINDOW_TYPE
_NET_WM_WINDOW_TYPE_DIALOG
_NET_WM_WINDOW_TYPE_DND
_NET_WM_WINDOW_TYPE_MENU
_NET_WM_WINDOW_TYPE_NORMAL
_NET_WM_WINDOW_TYPE_NOTIFICATION
_NET_WM_WINDOW_TYPE_TOOLTIP
_NET_WORKAREA
_SCREENSAVER_STATUS
_SCREENSAVER_VERSION
_XEMBED_INFO
application/octet-stream
application/vndchromiumtest
chromium/filename
chromium/x-bookmark-entries
chromium/x-browser-actions
chromium/x-file-system-files
chromium/x-pepper-custom-data
chromium/x-renderer-taint
chromium/x-web-custom-data
chromium/x-webkit-paste
image/png
image/svg+xml
marker_event
scaling mode
text/html
text/plain
text/plain;charset=utf-8
text/rtf
text/uri-list
text/x-moz-url
xwayland-pointer
xwayland-keyboard
xwayland-touch
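In the core X11 protocol an InternAtom request (opcode 16) carries the atom name as a length-prefixed string padded to a multiple of four bytes, so on the wire the names are not separated by any delimiter. The sketch below shows how such requests could be decoded in Python instead of being read by eye from a hex dump. It assumes a little-endian client and a buffer that holds only complete requests; build_intern_atom and parse_intern_atoms are hypothetical helper names, and this is not how the lists above were produced.

```python
import struct

INTERN_ATOM = 16  # core protocol opcode of the InternAtom request


def build_intern_atom(name: str, only_if_exists: bool = False) -> bytes:
    """Encode one little-endian InternAtom request (used here only for the demo)."""
    raw = name.encode("latin-1")
    pad = (-len(raw)) % 4                       # names are padded to 4 bytes
    length_units = 2 + (len(raw) + pad) // 4    # request length in 4-byte units
    header = struct.pack("<BBHHH", INTERN_ATOM, int(only_if_exists),
                         length_units, len(raw), 0)
    return header + raw + b"\x00" * pad


def parse_intern_atoms(stream: bytes) -> list:
    """Walk a buffer of X11 requests and return the names of all InternAtom requests."""
    names = []
    offset = 0
    while offset + 8 <= len(stream):
        opcode, _flag, length_units, name_len, _unused = struct.unpack_from(
            "<BBHHH", stream, offset)
        if length_units == 0:
            break  # extended-length request or garbage, give up in this sketch
        if opcode == INTERN_ATOM:
            start = offset + 8
            names.append(stream[start:start + name_len].decode("latin-1"))
        offset += length_units * 4  # every request states its own length
    return names


if __name__ == "__main__":
    demo = b"".join(build_intern_atom(n) for n in
                    ("XdndAware", "XdndDirectSave0", "_NET_WM_PID"))
    print(parse_intern_atoms(demo))
```

Because each name comes with its own length, decoding by the length field keeps neighbouring names apart even when one of them fills its four-byte slot exactly and leaves no padding in between.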
Port 37050 is the one related to drawing. It queried some extensions and then mostly worked with GLX and DRI2, requested GetGeometry, XFree86-VidModeExtension, GetInputFocus and received events such as dri2-InvalidateBuffers.
Port 37022 didn’t do anything of value. From the log it is visible that it initialized at the beginning and didn’t do anything for the rest of the interaction. It is so small that I could list all of its interactions in one image:
Port 37030 was a bit more involved than the previous one. For some reason it started by requesting the same atoms as port 37002. Maybe port 37002 from the first process set up the Atoms and filled them with data for the second process to read? Apart from this it just tried some GetProperty and QueryTree requests.
Port 37062 has about as few actions as port 37022: it did a setup at the beginning and nothing else.
Port 37036 is also a “passive” connection which queries for some extensions and then does nothing. It looks like it queries extensions that were already queried by previous connections.
From what I have seen so far, I do not see anything that would require having separate connections. It might be some optimization technique which I don’t know about. It would have been good if someone had written down the reasoning behind this structure. (I searched online, but either I am bad at searching or it is just not there yet.)
What about Firefox?
I did the same but shorter exploration with Firefox. It uses one process and two connections to Xorg. The first connection seems to be related to input and everything about window management, and the second is used for drawing only. On the input connection it sets up Atoms related to window management like WM_DELETE_WINDOW, WM_LOCALE_NAME, WM_PROTOCOLS, etc. Apart from this there was nothing out of the ordinary.
How many connections other applications open
I didn’t do a deep exploration of other applications, but here is some short information about apps that I use almost every day.
vlc - 2 connections
gpicview - 1 connection
skype - 8 connections spread between 2 processes (3 + 5)
telegram-desktop - 2 connections
inkscape - 1 connection
gimp - 1 connection
lowriter (soffice.bin) - 1 connection
localc (soffice.bin) - 1 connection
st (terminal) - 1 connection
android-studio (java) - 2 connections 1 process
vscode - 8 connections, 2 processes (3 connections for 'code-oss', 5 connections for 'electron' named processes)
For Chromium and Firefox the number of open windows and tabs does not affect the number of connections; it always remains static. Simpler apps, for example st (terminal), open a new process and a new connection for each of their windows.
Conclusion
I got involved with this crazy but super fun idea by accidentally noticing that closing Skype closed a bunch of Xorg connections. This made me interested in finding out why Skype needs so many connections in the first place. It led to the understanding that it was Chromium that was responsible for opening all these connections, and Skype was just using it.
I looked at how to force Xorg to allow only TCP connections so that I could look at its traffic through Wireshark (don’t forget to turn it off), and then analyzed what Chromium was doing with all those connections. From a very surface-level exploration, without digging into details or looking into Chromium’s source code, it looks like it is just some kind of optimization technique. There is one connection for drawing and one for input.
This post does not reveal anything new. The Chromium source code is open and you can find all of this by reading the code. But this is just a different way to look at it, and it was a fun way to waste my time. I hope you had your fun looking at it and didn’t waste yours.