FixBrowser: Inside the rich text component (+web demo) (2025/02/10)

This blog post describes the rich text component that the FixBrowser web browser uses to display HTML pages and to handle interaction with the user.

About FixBrowser

FixBrowser is a lightweight web browser created from scratch. It is designed from the ground up with privacy in mind. This is achieved by a whitelist approach, where only the resources strictly needed by the website are loaded, and by having JavaScript disabled (wait! read the next paragraph before dismissing it).

Many websites, however, are not usable with JavaScript disabled. FixBrowser therefore contains an updated set of fix scripts that fix and even improve individual websites as well as groups of websites that share a common technology (such as WordPress, Disqus forums, etc.).

FixBrowser is not ready for practical use yet. However, you can use the FixProxy tool, which runs the "backend" part of the browser (everything other than the actual rendering and layout) in front of a regular web browser. I've been using this proxy as my primary way of browsing the web for multiple years with good results.

The approach

The complexity of implementing a web browser from scratch was greatly reduced by not supporting JavaScript. This creates a cascade effect of simplification on every component and the overall architecture.

In a regular browser the complexity is very high because JavaScript can trigger any kind of update at any time. This requires maintaining complex data structures that store the relationships between the HTML, CSS and view models:

HTML model - the "physical" layout of the web page
CSS model - the styling applied using the cascade rules based on the HTML element hierarchy (this alone adds considerable complexity)
view model - represents the blocks containing "paragraphs" (which can be plain divs) and the inline elements (such as text) inside them; it must handle floats and layers (its structure is quite different from the other models, and simple changes to CSS properties can trigger complex changes in the view model)

The view model maintains its own hierarchy by means of stacking contexts. It is hierarchical so that, for example, an element with position: relative can act as an "anchor" that nested positioned elements are placed relative to.

When JavaScript is not supported at all, the models can be decoupled from each other and follow the UNIX philosophy of "do one thing and do it well". Instead of having everything intermixed you can process each model separately. This means the processing can go in a straight line: HTML → CSS → view model.

There is no need to maintain complex data structures that allow HTML or CSS to change dynamically. This saves both CPU and memory and is much simpler to implement. For example, applying the CSS cascade rules can be achieved by a simple recursion. The layers can be reduced to a simple linear list instead of being hierarchical. After the processing is done, only the data structures of the view model remain.
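As a rough illustration of the "simple recursion" idea, here is a minimal Python sketch. It is not FixBrowser's code (which is written in FixScript): the Element class, the INHERITED set and the tag-only selector matching are assumptions made for brevity, and real selector matching and specificity sorting are left out.

```python
from dataclasses import dataclass, field

# Properties that children inherit from their parent (tiny illustrative subset).
INHERITED = {"color", "font-size"}

@dataclass
class Element:
    tag: str
    children: list = field(default_factory=list)
    computed: dict = field(default_factory=dict)

def apply_styles(element, rules, parent_style=None):
    """Compute styles top-down in one recursive pass over a static tree."""
    parent_style = parent_style or {}
    # Start from the inherited subset of the parent's computed style.
    style = {k: v for k, v in parent_style.items() if k in INHERITED}
    # Assumption: rules are already sorted by specificity and source order,
    # so a later matching rule simply overrides an earlier one.
    for selector_tag, declarations in rules:
        if selector_tag == element.tag:
            style.update(declarations)
    element.computed = style
    for child in element.children:
        apply_styles(child, rules, style)

rules = [("body", {"color": "black", "margin": "8px"}), ("em", {"color": "red"})]
tree = Element("body", [Element("p", [Element("em")])])
apply_styles(tree, rules)
```

Because nothing can modify the tree afterwards, the computed styles never need to be invalidated or recomputed, which is where most of the bookkeeping in a dynamic engine would otherwise go.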

Rich text component

The rich text component is designed as a toolkit for implementing your own representations. It can be used for a simple edit box, for complex rendering of web pages, or for anything in between. More complex representations (such as tables) are achieved by nesting rich text layers.

The core of the component is a Layer. It contains zero or more blocks (e.g. paragraphs), where each block contains inline text, inline objects or floating objects. Both blocks and inline objects have their style information attached.

The inline objects are laid out into separate rows. Text is split if it is longer than a single row. Each row has the height of the tallest object present on the row, and text and objects are aligned to a common baseline.
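The row sizing rule can be sketched in a few lines of Python. This is one possible reading of the description rather than the actual implementation: the InlinePart class and its ascent/descent fields are hypothetical, and the row height is taken as the largest ascent plus the largest descent so that every part fits once the baselines line up.

```python
from dataclasses import dataclass

@dataclass
class InlinePart:
    width: int
    ascent: int   # pixels above the baseline
    descent: int  # pixels below the baseline

def layout_row(parts):
    """Return (row_height, baseline_y, [(x, y) of each part's top-left corner])."""
    max_ascent = max((p.ascent for p in parts), default=0)
    max_descent = max((p.descent for p in parts), default=0)
    positions = []
    x = 0
    for p in parts:
        # Shift each part down so that its own baseline lands on the shared one.
        positions.append((x, max_ascent - p.ascent))
        x += p.width
    return max_ascent + max_descent, max_ascent, positions

# A short piece of text next to a taller inline object that sits on the baseline.
height, baseline, positions = layout_row([InlinePart(30, 12, 4), InlinePart(20, 20, 0)])
```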

The structure of Layer

Block

The model is available in the browser/richtext/model.fix file. It consists of just a single class, Block, that contains a packed array of structures describing flows. There are two kinds of flows: text and objects. Text is handled specially because it can be word wrapped. Each flow contains an opaque reference to a style; this is not interpreted by the rich text component in any way, just passed along for your purposes. The Block also has its own style attached to it.

The provided methods add, set and remove manipulate the flows, and the get_flow_* methods retrieve information about the stored flows.
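To make the "packed array of structures" idea concrete, here is a small Python sketch. Only the method names add, set, remove and the get_flow_ prefix come from the description above; the parameters, the three-slot record layout and the specific getters are assumptions and will differ from the real model.fix.

```python
FLOW_TEXT, FLOW_OBJECT = 0, 1
FLOW_SIZE = 3  # slots per flow record: kind, payload, style (hypothetical layout)

class Block:
    def __init__(self, style=None):
        self.style = style  # opaque, never interpreted by the component
        self._flows = []    # packed: [kind, payload, style, kind, payload, style, ...]

    def _offset(self, index):
        if not 0 <= index < self.get_flow_count():
            raise IndexError(index)
        return index * FLOW_SIZE

    def add(self, kind, payload, style=None):
        self._flows.extend((kind, payload, style))

    def set(self, index, kind, payload, style=None):
        start = self._offset(index)
        self._flows[start:start + FLOW_SIZE] = (kind, payload, style)

    def remove(self, index):
        start = self._offset(index)
        del self._flows[start:start + FLOW_SIZE]

    def get_flow_count(self):
        return len(self._flows) // FLOW_SIZE

    def get_flow_kind(self, index):
        return self._flows[self._offset(index)]

    def get_flow_payload(self, index):
        return self._flows[self._offset(index) + 1]

    def get_flow_style(self, index):
        return self._flows[self._offset(index) + 2]

block = Block(style="paragraph")
block.add(FLOW_TEXT, "Hello ", style="normal")
block.add(FLOW_OBJECT, "button-1")
block.add(FLOW_TEXT, "world", style="bold")
block.remove(1)
```

Storing all flows in one flat array avoids allocating an object per flow, which is presumably the point of the packed layout.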

Layer

To create a Layer you need to pass an array of Blocks along with an instance of a ModelToView class. ModelToView converts the representation in Blocks to the view representation using the LineBuilder class. LineBuilder is responsible for laying out the individual rows of text and inline objects to a given width. Whenever the Layer is resized, this process is repeated.

You can look at an example of how the ModelToView can be implemented here.
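The division of labour between the three classes can be pictured with a deliberately tiny Python sketch. The names Layer and resize mirror the text, but the constructor shape, the rows attribute and the simple_model_to_view function are all invented for illustration; the stand-in converter wraps plain strings by character count instead of driving a real LineBuilder.

```python
import textwrap

def simple_model_to_view(blocks, width):
    """Stand-in converter: each block is a plain string, width is in characters."""
    rows = []
    for block in blocks:
        # An empty block still occupies one (empty) row.
        rows.extend(textwrap.wrap(block, width) or [""])
    return rows

class Layer:
    def __init__(self, blocks, model_to_view):
        self.blocks = blocks
        self.model_to_view = model_to_view
        self.width = None
        self.rows = []

    def resize(self, width):
        # The model is never modified; the whole view is rebuilt for the new width.
        if width != self.width:
            self.width = width
            self.rows = self.model_to_view(self.blocks, width)

layer = Layer(["hello world foo"], simple_model_to_view)
layer.resize(11)
wide_rows = list(layer.rows)
layer.resize(5)
narrow_rows = list(layer.rows)
```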

LineBuilder

It supports these operations:

adding and clearing of floats on the left or right side
adding text with word wrapping and inline objects
adding multiple draw layers for fancy backgrounds and foregrounds
aligning inline parts to center/right or spreading it to the edges (justify)
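The interaction between floats and word wrapping is easiest to see in code. The following Python sketch is a simplified stand-in for that part of LineBuilder, not its real interface: the wrap_words function and its parameters are hypothetical, all rows share one height, and every float is assumed to start at the top of the layer.

```python
def wrap_words(word_widths, space_width, layer_width, row_height, floats=()):
    """Greedy wrapping around floats.

    `floats` holds (side, width, height) boxes anchored at the top of the layer.
    Returns a list of rows, each as (x_start, [indices of the words on the row]).
    """
    rows = []
    i = 0
    y = 0
    while i < len(word_widths):
        # Floats narrow the row only while the row is beside them.
        left = sum(w for side, w, h in floats if side == "left" and y < h)
        right = sum(w for side, w, h in floats if side == "right" and y < h)
        available = layer_width - left - right
        row, used = [], 0
        while i < len(word_widths):
            extra = word_widths[i] + (space_width if row else 0)
            # The first word is always placed, even if it overflows the row.
            if row and used + extra > available:
                break
            row.append(i)
            used += extra
            i += 1
        rows.append((left, row))
        y += row_height
    return rows

rows = wrap_words([40, 40, 40, 40], space_width=10, layer_width=100,
                  row_height=20, floats=[("left", 50, 20)])
```

In this example the first row sits beside the float and has room for a single word; the following rows are below the float and use the full width again.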

Each inline part has its own handler that is responsible for sizing, painting and handling events from the user.

Demo

The demo shows a test of the rich text component. It shows these features:

floats and their layout in a non-trivial configuration
word wrapping of the text, including treating text that has different styles but no spaces in between as one "word"
inserting buttons and text fields as inline objects, allowing for interactive elements
testing the overdraw feature that allows inline objects to be bigger than the row (e.g. to support shadows)
support for tables by using nested layers and having the table as an inline object
support for overflowing when an inline object is wider than the layer (a horizontal scrollbar appears)
ability to highlight the internal parts that the text and inline objects were converted to
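The "one word across several styles" behaviour from the list above can be sketched as follows. This is an illustrative Python version under the assumption that runs are (text, style) pairs and that only the space character separates words; the function name group_into_words is made up.

```python
def group_into_words(runs):
    """Split styled runs at spaces.

    Pieces from adjacent runs with no space between them stay together,
    so the wrapper can treat them as a single unbreakable word.
    """
    words, current = [], []
    for text, style in runs:
        for n, piece in enumerate(text.split(" ")):
            if n > 0 and current:
                # A space was crossed, so the word collected so far is complete.
                words.append(current)
                current = []
            if piece:
                current.append((piece, style))
    if current:
        words.append(current)
    return words

# "Hello world" where the style changes in the middle of both words.
words = group_into_words([("Hel", "bold"), ("lo wor", "normal"), ("ld", "italic")])
```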

The model of the shown content is defined in code here.

How it is used for rendering of HTML pages

The features of the rich text component are all that is needed to support rendering full HTML pages. Some examples of how things are handled:

block elements (such as divs) are converted to a single Block containing a full-width inline object that has its content in a nested Layer
inline block elements (display: inline-block) are handled like block elements, with the difference that they are not put in a dedicated Block but sit alongside other inline elements
the CSS layers and stacking contexts are resolved and flattened to a linear list of layers that are mapped to rich text layers
the hierarchy of inline elements (such as spans and other formatting elements) is flattened to a linear representation
the styling of inline elements (such as nested borders and backgrounds) is converted to a linear representation as well; the hierarchical nature is then emulated in an inline part handler (this gives the right visual result but the rich text component just sees a linear list of inline parts)
tables are handled similarly to block elements; each cell has its own Layer to show the content of the cell
relative positioning (position: relative) uses a "mirror" mechanism: the original Layer is laid out as normal but the drawing is redirected to another layer that is shown at a different position
sometimes the width/height of the resulting layout needs to be calculated in advance (such as to determine the widths of table cells); this is handled by a special mode that lays out the Layer at a very large width and then reads back the width actually used (some operations are omitted in this mode, such as text alignment or anything else that would try to "fill" the whole width)
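The flattening of nested inline elements mentioned in the list can be illustrated with a short recursive function. This is a Python sketch under simplifying assumptions: a node is either a string or a (tag, children) tuple, and the "style" of a run is just the tuple of enclosing tags. The names are hypothetical.

```python
def flatten_inline(node, enclosing=()):
    """Yield (text, enclosing_tags) runs in document order, outermost tag first."""
    if isinstance(node, str):
        yield node, enclosing
        return
    tag, children = node
    for child in children:
        yield from flatten_inline(child, enclosing + (tag,))

# <span>a <b>bold <i>both</i></b> z</span>
tree = ("span", ["a ", ("b", ["bold ", ("i", ["both"])]), " z"])
runs = list(flatten_inline(tree))
```

Each run carries enough information for a handler to redraw nested borders and backgrounds, while the layout code only ever sees a flat list.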

The result is that the rich text component can be fully developed, and all its edge cases resolved, without any concern for the complexities of rendering HTML pages. This would be impossible with JavaScript support, as that would require handling everything in one giant interconnected mess instead of having a nice separation of concerns.

Unicode support

My general stance on Unicode support is to simplify the processing of Unicode in a way that makes it simple to use by other code. If that can't be done, the question is whether that particular feature should be supported at all. I'm willing to not support every language out there if supporting it would make the implementation complexity spike, as that has a cascade effect on everything.

Characters in Unicode are hard. They can be a single code point or a combination of multiple code points. Many characters can be represented in multiple ways (e.g. precomposed vs. decomposed characters) but should be treated as equal. Each "true" character, as perceived from a user's perspective, is called a grapheme cluster.
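The precomposed vs. decomposed distinction is easy to demonstrate with Python's standard unicodedata module. This only illustrates the Unicode behaviour itself and says nothing about how FixBrowser handles it; the helper name same_character is made up.

```python
import unicodedata

precomposed = "\u00e9"   # "é" as a single code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

def same_character(a, b):
    """Compare two strings after bringing both to the same normalization form."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

raw_equal = precomposed == decomposed               # plain comparison
normalized_equal = same_character(precomposed, decomposed)
lengths = (len(precomposed), len(decomposed))       # code points, not characters
```

Both strings render as the same character, yet they differ in length and compare as unequal until they are normalized.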

They should be treated as a single character, that means selection and editing actions should treat it as an opaque block. However when writing characters it needs to be temporarily handled specially as the user needs to add code points to compose the final character by using a specific input method.

Grapheme clusters

As explained before, the overall approach to the design is to follow the UNIX philosophy of "doing one thing and doing it well". Mixing multiple aspects together, such as the complexity of Unicode with rich text handling, goes against this philosophy.

Instead I've come up with a simple solution (finding such simple solutions can be surprisingly hard though) that decouples these problems. FixBrowser is written in an extensible language (FixScript) that allows a wide range of extensions, both directly in the language and by adjusting the native runtime. This made it possible to implement dynamic allocation of characters that represent whole grapheme clusters. It even integrates with the garbage collector (GC) to deallocate characters that are no longer used.

This way the rich text implementation can be blissfully ignorant about the whole concept and work with grapheme clusters as single characters like in the good old pre-Unicode days when you dealt with just one language/encoding.
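The general idea of allocating a single "character" per grapheme cluster can be sketched in Python. Everything here is an assumption made for illustration: the ClusterTable class, the choice of ids above the Unicode range, and the crude segmentation that only attaches combining marks to the preceding character (real grapheme segmentation has more rules). The garbage collector integration is not shown at all.

```python
import unicodedata

FIRST_DYNAMIC = 0x110000  # first value above the Unicode code point range

class ClusterTable:
    def __init__(self):
        self._by_cluster = {}  # cluster string -> allocated id
        self._by_id = {}       # allocated id -> cluster string

    def _intern(self, cluster):
        if len(cluster) == 1:
            return ord(cluster)  # ordinary characters keep their code point
        if cluster not in self._by_cluster:
            new_id = FIRST_DYNAMIC + len(self._by_id)
            self._by_cluster[cluster] = new_id
            self._by_id[new_id] = cluster
        return self._by_cluster[cluster]

    def encode(self, text):
        """Convert text to a list of ints, one per (simplified) grapheme cluster."""
        clusters = []
        for ch in unicodedata.normalize("NFC", text):
            if clusters and unicodedata.combining(ch):
                clusters[-1] += ch  # attach the mark to the previous character
            else:
                clusters.append(ch)
        return [self._intern(c) for c in clusters]

    def decode(self, ids):
        return "".join(self._by_id.get(i) or chr(i) for i in ids)

table = ClusterTable()
# "q" + combining acute has no precomposed form, so it needs a dynamic id.
ids = table.encode("aq\u0301b")
```

Code that consumes such a list can index, select and delete by position without knowing that one of the entries stands for two code points.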

Right-to-left languages

I have not done any implementation work on this yet, but from preliminary research it doesn't appear to be that complicated to add. It affects the direction in which inline objects are laid out and how selection is done. This is something that can be handled directly by the rich text component without adding too much complexity (famous last words).

Sponsoring

If you like this project you can support it by donating, allowing me to work on it more and make FixBrowser usable for actual browsing.

As a donor you can submit which websites you would like to be supported (either in the form below or at any later time). You will also get access to the GateOpener service.

Please select additional areas that you wish to be implemented the most (read FAQ section for more information):

integration of CEF - allows a full browser experience to be used for selected websites or tabs
video playback support - support for playing videos using the <video> tag and YouTube
extensions support - good support for extensions, including the ability to use native code
improvements of FixProxy - if you prefer using FixProxy and would like to see it improved with more features

You can donate using a PayPal account or a debit/credit card (no PayPal account required).
